This course studies the management of large bodies of information. This includes schemes for the representation, manipulation, and storage of complex information structures, as well as algorithms for processing these structures efficiently and for retrieving the information they contain. The course includes programming in Python. Topics include relational databases as well as NoSQL (Not Only SQL) data stores.

If you have enrolled in this class, please complete homework 0 as soon as possible, preferably before the first day of class.


This course is based upon the course designed by Prof. Amit Chakrabarti. Other guidance came from Prof. Jennifer Widom’s database courses at Stanford, now available through, and from Will Cross and Norberto Leite of the MongoDB team. In addition, invaluable assistance from graduate students Ray Jenkins and Lixing Lian has greatly improved the course. This instructor is deeply indebted to these outstanding educators.