Use the lecture's Jupyter Notebook as a starting point.

Anomaly Detection

  1. Head over to the Oslo City Bike webpage and download their 2016 data.
  2. Load the data into Spark and parse the timestamps of the starts of the trips.
  3. Repeat the anomaly detection steps from the lecture on this data.
  4. Advanced. Repeat the above analysis using counts per day and per station, then try to identify anomalous days for a given station.
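
As a rough illustration of steps 2 and 4, the sketch below parses trip start timestamps, counts trips per station and day, and flags unusual days for one station. It is plain Python rather than Spark so the logic is easy to follow; in Spark the counting step would roughly correspond to a `groupBy` on station and date followed by a count. Everything specific here is an assumption: the sample rows are synthetic (not real Oslo City Bike data), the timestamp format, the station name `station_A`, and the helper names are hypothetical, and the median/MAD outlier rule is a swapped-in technique that may differ from the method used in the lecture.

```python
from collections import Counter
from datetime import datetime
from statistics import median

# Assumed timestamp format; adjust it to match the downloaded files.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_sample_trips():
    """Build synthetic (station, start time) rows. NOT real Oslo City Bike data."""
    trips = []
    normal_counts = [10, 11, 9, 10, 12, 10, 11]  # trips per day for days 1-7
    for day, n in enumerate(normal_counts, start=1):
        trips += [("station_A", f"2016-06-{day:02d} 08:00:00")] * n
    # Day 8 gets an artificially injected spike of 40 trips.
    trips += [("station_A", "2016-06-08 08:00:00")] * 40
    return trips


def daily_counts(trips):
    """Count trips per (station, calendar day) from raw start timestamps."""
    counts = Counter()
    for station, start in trips:
        day = datetime.strptime(start, TIME_FORMAT).date()
        counts[(station, day)] += 1
    return counts


def anomalous_days(counts, station, threshold=3.5):
    """Flag days whose count lies far from the station's median (MAD rule).

    Days with zero trips never appear in `counts`, so they are not considered.
    """
    per_day = {day: c for (s, day), c in counts.items() if s == station}
    values = list(per_day.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so the rule cannot rank days
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return sorted(
        day for day, c in per_day.items()
        if 0.6745 * abs(c - med) / mad > threshold
    )


counts = daily_counts(make_sample_trips())
print(anomalous_days(counts, "station_A"))
```

The median and MAD are used instead of the mean and standard deviation because a single extreme day distorts them far less; the threshold of 3.5 is a common convention, not a value taken from the lecture.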