CS034/CS134, Spring 2010
Machine Learning and Statistical Data Analysis

Course description

This course provides an introduction to statistical modeling and machine learning. Topics include learning theory, supervised and unsupervised machine learning, and statistical inference and prediction. A wide variety of algorithms will be presented, including K-nearest neighbors, naive Bayes, decision trees, support vector machines, logistic regression, K-means, mixtures of Gaussians, principal components analysis, and Expectation Maximization. The course will also discuss modern applications of machine learning such as image segmentation and categorization, speech recognition, and text processing.

Administrative information

Instructor
Lorenzo Torresani | 109 Sudikoff | office hours: by appointment
Teaching assistant
Qingyuan Kong | 222 Sudikoff | office hours: Th 6-8pm or by appointment
Course staff email
cs134 -at- cs -dot- dartmouth -dot- edu
Lectures
T&Th 10-11:50am | x-hour (used occasionally to make up cancelled classes) W 3-3:50pm
006 Kemeny Hall
Lab
Sudikoff 001: Linux machines with Matlab. As an alternative, you can use Matlab on your machine by following the instructions provided here.
Textbook (recommended but not required)
Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer 2006

Grading and policies

Grading scheme
The course grade will be based 10% on in-class participation, 40% on the homework assignments (each of the four homework assignments will count for 10% of the final grade), and 50% on the term project. The homeworks will require answering questions and implementing some algorithms in Matlab, but prior knowledge of Matlab is not required. During x-hours on March 31st and April 7th, we will present a tutorial covering the basics of Matlab. In addition, the College will offer a Matlab class on April 14th (see details here).
Late homeworks
Each student has 3 free late days to be used over the course of the term as he/she likes. Once these days are used up, any homework turned in late will be penalized 25% per late day. No homework will be accepted more than 3 days after its due date. No exceptions. The late days can be used only for the homeworks, not for the project submissions. Any portion of a late day is counted as one full day. Assignments are typically due at 11:59 pm on the due date. The code portion of each homework submission must be turned in via Blackboard. The answers to technical questions can either be written on paper and left in the course mailbox near the Sudikoff entrance or be submitted in electronic form via Blackboard.
Auditing
Please contact the instructor if you would like to audit the course.

Academic integrity

You may discuss the assignments with other current CS034/134 students, but your submission must be entirely your own work. That is, your code and any other solutions you submit must be created, written/typed, and documented by you alone. You may not copy anything directly from another student's work. For example, memorizing or copying onto paper a portion of someone else's solution would violate the honor code, even if you eventually turn in a different answer. Similarly, e-mailing a portion of your code to another student, or posting it online for them to see, would violate the honor code. We do encourage discussion of assignments between students, subject to these rules.

You cannot make use of any code taken from outside references for your homeworks, unless explicitly authorized to do so by the instructor. As a rule of thumb, you should treat any external code as software written by another CS034/134 student: you are not allowed to copy it or to use it as a template to implement your solution.

You are allowed to use external software for your project. However, you should clearly report the use of external code and include pointers to such software in your project write-up. The project grade will be based on the novelty of your solution/application and also on the amount of new code you write to implement the idea. Keep this in mind when considering whether to use software written by someone else.

These rules will be strictly enforced, and any violation will be treated seriously.