Five Tutorial

File uploaded by Andrew Flint on Jun 21, 2017. Last modified by Andrew Flint on Aug 8, 2017.
Version 4

Here is a bundle of five notebooks and a sample dataset to introduce you to using notebooks in Analytics Workbench (AW).



After you download and unzip the file, use the Import button to upload each of the .json files into Analytics Workbench. For more help installing sample notebooks, please read How to Use Sample Notebooks from the Community.
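If you prefer to unpack the download from a script, the unzip step can be sketched in Python as follows. This is only a minimal sketch: the function name `extract_notebooks` and any archive or folder names you pass to it are hypothetical, and the import itself still happens through the Import button in Analytics Workbench.

```python
import zipfile
from pathlib import Path


def extract_notebooks(zip_path, dest_dir):
    """Unzip the bundle and return the .json notebook files found in it.

    zip_path and dest_dir are whatever names you chose locally; nothing
    here is specific to Analytics Workbench.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    # Search recursively in case the archive wraps its files in a folder.
    return sorted(p for p in dest.rglob("*.json") if p.is_file())
```

Calling `extract_notebooks("tutorials.zip", "tutorials")` (both names are placeholders) returns the paths of the notebook files, which you can then upload one at a time with the Import button.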


As of this version, the zip contains:

  1. Tutorial #1: Your First Notebook — a very simple introduction to the use and layout of notebooks
  2. Tutorial #2: Basic Data Access — demonstrates how to access data already loaded in AW using Python, R, SQL and Scala
  3. Tutorial #3: SQL Queries and Visualization — a Scala and SQL notebook that first retrieves a file from S3, and then makes it available for a series of interactive SQL queries and visualization
  4. Tutorial #4: Construct Flight Delays Dataset — a Python and SQL notebook that retrieves (and samples) Flight Delay data from an AWS S3 bucket, stores it in AW as a dataset, and queries the results
  5. Tutorial #5: Machine Learning on HELOC Data — a fairly thorough example of machine learning techniques on the credit risk modeling dataset called HELOC, almost entirely in Python


Tutorial notebooks #2 and #5 rely on a HELOC dataset that is also included in this .zip archive; the remaining notebooks draw their data from the internet. After you have downloaded and unzipped the archive attached below, find the file "HELOC_with_scores_trees.csv.bz2" (a bzip2-compressed CSV file) and upload it as-is, without decompressing it, in Data > New Dataset. (This step is necessary only if you wish to run notebooks #2 and #5.)
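If you would like to look inside the HELOC file before uploading it, the following Python sketch reads the first few rows without writing a decompressed copy to disk. It assumes the file is comma-delimited, UTF-8 encoded, and starts with a header row; none of that is stated above, and the function name `preview_bz2_csv` is hypothetical.

```python
import bz2
import csv
from itertools import islice


def preview_bz2_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) from a bzip2-compressed CSV.

    Assumes a comma-delimited, UTF-8 file whose first line is a header.
    """
    # Text mode decompresses on the fly; the file on disk is left untouched,
    # so it can still be uploaded as-is afterwards.
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows
```

For example, `header, rows = preview_bz2_csv("HELOC_with_scores_trees.csv.bz2")` gives you the column names and the first five records to inspect.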


As always, if you have samples of your own that you'd like to share, by all means, share them!