Seven Sample Notebooks

File uploaded by Andrew Flint Advocate on Feb 17, 2017
Last modified by Makenna.Brei on Jun 21, 2017
Version 12

This .zip contains 7 sample notebooks that you might find helpful for getting started. After you download and unzip the file, use the Import button to upload the notebooks into Analytics Workbench.
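If you'd like to check what is in the download before importing anything, a small script can list the notebooks for you. This is only a rough sketch, and it rests on assumptions not stated on this page: that each notebook in the zip is a Zeppelin-style JSON export with a top-level "name" field and a "paragraphs" list. The function name list_notebooks is made up for illustration.

```python
import json
import zipfile


def list_notebooks(zip_path):
    """Return (file name, notebook title, paragraph count) for each JSON notebook in the zip.

    Assumption: notebooks are JSON exports with top-level "name" and
    "paragraphs" keys. Files that do not fit that shape are skipped.
    """
    found = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in sorted(archive.namelist()):
            if not member.lower().endswith(".json"):
                continue  # skip folders and non-JSON files
            try:
                note = json.loads(archive.read(member).decode("utf-8"))
            except (UnicodeDecodeError, ValueError):
                continue  # not readable as JSON, so not a notebook we can describe
            if not isinstance(note, dict):
                continue
            title = note.get("name", "(untitled)")
            paragraphs = note.get("paragraphs", [])
            found.append((member, title, len(paragraphs)))
    return found
```

For example, list_notebooks("sample_notebooks.zip") would return one tuple per notebook, where sample_notebooks.zip is a placeholder for whatever name the downloaded file has on your machine.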





As of this version, the zip contains:

  1. Tutorial #1: Your First Notebook — a very simple introduction to the use and layout of notebooks
  2. Simple Queries — a Scala and SQL notebook that first retrieves a file from S3, and then makes it available for a series of interactive SQL queries and visualizations
  3. Data Access v1.2.1 — demonstrates how to access data already loaded in AW using Python, R, SQL and Scala, and how to use data from other places in an AW notebook
  4. Machine Learning with HELOC Data — a fairly thorough example of machine learning techniques on the credit risk modeling dataset called HELOC, almost entirely in Python
  5. Construct Flight_Delays Dataset — a Python and SQL notebook that retrieves (and samples) Flight Delay data from an AWS S3 bucket, stores into AW as a dataset, and queries the results
  6. ML & SQL on AW Datasets — trains and evaluates a Random Forest machine learning model on Flights data, using the Spark ML library (in Python), and explores the same data using SQL queries
  7. Quick Interpreters Test — a quick test of all 8 supported interpreters (languages). You might use this if you suspect something is wrong with our Zeppelin service: maybe one or more interpreters need to be restarted.


If you're just beginning, I'd start with Tutorial #1 to get your bearings. Then I'd move on to Simple Queries to try out some interesting SQL tricks. Next, take a look at the Data Access notebook to see how to access data that's already loaded in AW from a few different languages. The Machine Learning with HELOC Data example is fairly comprehensive and will likely become more so over time. Finally, if the 2007-short and 2008-short datasets aren't yet in AW, fetch the airline data using Construct Flight_Delays Dataset, and then run some of the fun stuff in ML & SQL on AW Datasets.


We expect to add many more sample notebooks over the coming weeks, so check back often. We're also looking for a way to make them more accessible directly from within AW.


If you'd like more help downloading and installing these sample notebooks, please read How to Use Sample Notebooks from the Community.


And if you've got samples of your own that you'd like to share, please do!