Use the templates repository for boilerplate code.
Testing
- Write a set of unit tests for any non-trivial class or function from previous lectures or homework assignments.
- Document a function and run doctests on it.
- Sign up for Twitter and create your first app. Store your consumer key, consumer secret, access token, and access token secret in a safe location.
- Install the requests and requests_oauthlib packages and, using the code from the lecture, collect the most common domains linked by Twitter statuses of a topic of your choice.
- Using matplotlib, visualize the results of the previous task.
- Advanced: Write a generator to help you with the task of extracting information. It should continue yielding domains indefinitely. Break the task down into small functions and write unit tests for each of those.
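
As a starting point for the doctest and generator exercises, here is a minimal sketch. It assumes each status is a dict whose links sit under entities → urls → expanded_url, as in Twitter's JSON; adjust the keys if your data looks different. The function names extract_domain, iter_domains and most_common_domains are made up for this illustration and do not come from the lecture code.

```python
from collections import Counter
from urllib.parse import urlparse


def extract_domain(url):
    """Return the lower-cased host of a URL, without a leading "www.".

    >>> extract_domain("https://www.example.com/some/page")
    'example.com'
    >>> extract_domain("http://blog.example.org")
    'blog.example.org'
    """
    # .hostname is already lower-cased and excludes any port number.
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def iter_domains(statuses):
    """Lazily yield one domain per link found in an iterable of statuses.

    Assumption: each status is a dict shaped like Twitter's JSON, with
    links under status["entities"]["urls"][i]["expanded_url"].
    The generator keeps yielding for as long as `statuses` keeps
    producing items, so it also works on an endless stream.
    """
    for status in statuses:
        for link in status.get("entities", {}).get("urls", []):
            url = link.get("expanded_url") or link.get("url")
            if url:
                yield extract_domain(url)


def most_common_domains(statuses, n=10):
    """Count domains over a finite iterable and return the top n pairs."""
    return Counter(iter_domains(statuses)).most_common(n)


if __name__ == "__main__":
    import doctest

    # Runs the examples embedded in the docstrings above.
    doctest.testmod()
```

Because each step is its own small function, each can be unit-tested separately with hand-written status dicts, without touching the network.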
MongoDB
- Install MongoDB and the pymongo package on your computer.
- Create a small database of information of your choice. You can do this manually, record by record (a handful is enough), or you can try storing some of the information you scraped last week.
- Experiment with retrieving and updating information.
- Advanced: Write a few aggregations on your data.
- Advanced, but recommended: Use your Twitter code to store statuses for a given topic over a longer period of time, using the filter streaming API. A meaningful period could be, e.g., 12 or 24 hours. Make a plot of the per-hour interest in your topic.
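
For the aggregation and per-hour tasks, one possible approach is to let MongoDB do the counting. The sketch below assumes that each stored status has a created_at field saved as a real date (not Twitter's original string) and, optionally, a topic field that you added yourself when storing it. Both field names, and the helper names per_hour_pipeline and statuses_per_hour, are assumptions made for this example.

```python
def per_hour_pipeline(topic=None, date_field="created_at"):
    """Build an aggregation pipeline that counts statuses per hour (UTC).

    Assumptions: `date_field` holds a BSON date, and "topic" is a
    hypothetical field you set yourself when inserting a status.
    """
    pipeline = []
    if topic is not None:
        # Keep only the statuses stored for the requested topic.
        pipeline.append({"$match": {"topic": topic}})
    pipeline += [
        {"$group": {
            # Truncate each timestamp to the hour, e.g. "2020-01-01 10:00".
            "_id": {"$dateToString": {
                "format": "%Y-%m-%d %H:00",
                "date": "$" + date_field,
            }},
            "count": {"$sum": 1},
        }},
        # The hour strings sort chronologically.
        {"$sort": {"_id": 1}},
    ]
    return pipeline


def statuses_per_hour(collection, topic=None):
    """Run the pipeline on a collection; return a list of (hour, count)."""
    cursor = collection.aggregate(per_hour_pipeline(topic))
    return [(doc["_id"], doc["count"]) for doc in cursor]
```

With pymongo installed and a local server running, the collection could come from something like pymongo.MongoClient()["twitter"]["statuses"], where the database and collection names are again placeholders. The resulting (hour, count) pairs can be passed to matplotlib for the plot.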