# Machine Learning 10-601 in Spring 2016

## Announcements

- Solutions to the final exam are available.
- HW7 is released here. Due at 5:30 pm, Apr. 28, 2016.
- HW6 is released here. Due at 5:30 pm, Apr. 14, 2016.
- HW5 is released here. Due at 5:30 pm, Mar. 31st, 2016.
- Midterm solution is released.
- HW4 is released here. Due at 5:30 pm, Mar. 3rd, 2016.
- HW3 is released here. Due at 5:30 pm, Feb. 18, 2016.
- HW2 is released here. Due at 5:30 pm, Feb. 4, 2016.
- HW1 is released here. Due at 5:30 pm, Jan. 21, 2016.
- Important announcements will be made here as well as on Piazza.

## Important People and Places

- For Section 10-601 - Section B
  - Instructors:
    - William Cohen, Machine Learning Dept and LTI
    - Maria-Florina Balcan, Machine Learning Dept and CSD
  - When/where: Mon/Wed 10:30-11:50am, GHC 4401
- Sandy Winkler, sandyw@cs.cmu.edu is the course secretary.
- TAs:
  - María De Arteaga (mdeartea@andrew.cmu.edu)
  - Travis Dick (tdick@cs.cmu.edu)
  - William Herlands (herlands@cmu.edu)
  - Renato Negrinho (negrinho@cs.cmu.edu)
  - Tianshu Ren (tren@andrew.cmu.edu)
  - Han Zhao (han.zhao@cs.cmu.edu)
  - Zichao Yang (zichaoy@cs.cmu.edu)

- We'll be using Autolab for most assignments, and Piazza for general Q/A. The lectures are recorded by MediaTech.
  - Autolab Page for assignments
  - Piazza Page - announcements and discussion. Note: you should sign up for Piazza with your **andrew email**, because we will be using Piazza polls to monitor attendance.
  - MediaTech page - recorded lectures

- 10-601b-instructors@lists.andrew.cmu.edu is a mailing list that goes to the TAs and professors only.

## Recitation Schedule

Recitations are normally held:

- Mon 7:00pm, GHC 4303
- Thu 5:30pm, GHC 4303

Schedule for the next few weeks:

- Thu 1/14: Will Herlands, review and tour of Matlab
- Mon 1/18: Holiday, no recitation
- Thu 1/21: María, Naive Bayes review and exercises
- Mon 1/25: María, Naive Bayes review and exercises

## Office hours Schedule

Instructor | Day | Time | Location |
---|---|---|---|
William | Thursday | 11am | GHC 8217 |
Nina | Monday | 3:30pm - 4:30pm | GHC 8017 |
Han | Thursday | 3pm - 4pm | GHC 8007 |
Will H | Monday | 12pm - 1pm | HBH 3049a |
María | Tuesday | 10am - 11am | NSH 3128 |
Tianshu | Tuesday | 3:30pm - 4:30pm | NSH A408 |
Zichao | Thursday | 12:30pm - 1:30pm | GHC 6004 |
Renato | Wednesday | 3:30pm - 4:30pm | GHC 8208 |
Travis | Friday | 5:00pm - 6:00pm | GHC 6008 |

Note: William will be out of the country and will not hold office hours 1/28 - 3/10.

## Description

Machine Learning (ML) asks "how can we design programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience, e.g. spotting high-risk medical patients, recognizing speech, classifying text documents, detecting credit card fraud, or driving autonomous robots.

Topics covered in 10-601 include concept learning, version spaces, decision trees, neural networks, computational learning theory, active learning, estimation & the bias-variance tradeoff, hypothesis testing, Bayesian learning, the Naïve Bayes classifier, Bayes Nets & Graphical Models, the EM algorithm, Hidden Markov Models, K-Nearest-Neighbors and nonparametric learning, reinforcement learning, bagging and boosting, and other topics.

10-601 focuses on the mathematical, statistical and computational foundations of the field. It emphasizes the role of assumptions in machine learning. As we introduce different ML techniques, we work out together what assumptions are implicit in them. Grading is based on written assignments, programming assignments, and a final exam.

10-601 focuses on understanding what makes machine learning work. If your interest is primarily in learning the process of applying ML effectively, and in the practical side of ML for applications, you should consider Machine Learning in Practice (11-344/05-834).

10-601 is open to all but is recommended for CS Seniors & Juniors, Quantitative Masters students, and non-SCS PhD students.

## Syllabus and Text

Syllabus for Machine Learning 10-601, including lecture slides and HWs

Previous syllabi, for the historically-minded:

- Nina and Tom's class from Spring 2015
- Syllabus for Machine Learning 10-601 in Fall 2014 - William and Ziv's class from fall 2014
- Syllabus for Machine Learning 10-601 in Fall 2013 - William and Eric Xing's class from fall 2013
- Ziv's 701 lectures
- Ziv's class with Tom fall 2012
- Roni's 10-601 syllabus

Recommended Texts:

- Tom Mitchell's textbook, Machine Learning
- Machine Learning: a Probabilistic Perspective, K. Murphy, MIT Press, 2012
- Pattern Recognition and Machine Learning, Christopher Bishop, Springer-Verlag 2006

## Prerequisites

Formal prerequisites:

- Prerequisites are 15-122, Principles of Imperative Computation AND 21-127: Concepts of Mathematics.
- Additionally, a probability course is a co-requisite: 36-217: Probability Theory and Random Processes OR 36-225: Introduction to Probability and Statistics I
- A minimum grade of 'C' is required in all these courses.

Self-assessment for students:

- Students, especially graduate students, come to CMU with a variety of different backgrounds, so formal course prereqs are hard to establish. There is a short self-assessment test to see if you have the necessary background for 10-601. We recommend that all students take this before enrolling in 10-601 to see if they have the necessary background knowledge already, or if they need to review and/or take additional courses.

Refresher material:

A few resources that can help you review the math required to do well in a machine learning course:

- Linear Algebra: Review, Cheatsheet
- Probability: Review, Cheatsheet

Some other reviews you might be interested in:

- Zico Kolter, a prof in CSD, has put up a set of video lectures that review linear algebra.
- Very recently, Aaditya Ramdas, a grad student in MLD, has put up some video reviews of multivariate calculus and multivariate probabilities and stats.

To assess whether you need to watch these, you should do the self-assessment test, which is linked to on the wiki.

## Grading Policy

- 50% for homeworks. There are 6 and you can drop 1.
- 20% for midterm
- 20% for final
- 10% for class participation.
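
As a rough illustration of how these weights combine, the sketch below computes an overall score. It assumes every component is scored on a 0 to 100 scale, that homeworks are weighted equally, and that the dropped homework is the lowest one. The function name `course_score` and its parameters are hypothetical, and the instructors' actual computation may differ.

```python
def course_score(homeworks, midterm, final, participation):
    """Combine component scores (each assumed to be in 0..100) into one score.

    Assumptions (not stated on this page): homeworks are equally weighted
    and the single dropped homework is the lowest-scoring one.
    """
    if len(homeworks) < 2:
        raise ValueError("need at least two homework scores to drop one")

    # Drop the single lowest homework, then average the rest.
    kept = sorted(homeworks)[1:]
    hw_average = sum(kept) / len(kept)

    # Weights from the grading policy: 50% / 20% / 20% / 10%.
    return (0.50 * hw_average
            + 0.20 * midterm
            + 0.20 * final
            + 0.10 * participation)


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    print(course_score([100, 90, 80, 70, 60, 50], midterm=80, final=90, participation=100))
```

With the made-up scores in the example, the lowest homework (50) is dropped, the remaining five average to 80, and the weighted total comes to 84.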

## Projects

- TBA

## Policies and FAQ

### FAQ

**Can I take the class pass/fail? Or, can I audit?** Our policy is to give priority to students that are taking the class for a grade, so you cannot sign up for the class pass/fail or as an audit unless the waitlist clears. However, I expect that this spring there will be no waitlist. (In assigning a pass or fail, we will consider the letter grade you would be assigned and the requirements of the program that you are in, and assign a pass if your program would consider your computed grade as a pass.)

**Can I get an extension on ....?** Generally no, but you can get 50% credit for up to 48 hrs after the assignment is due, and you can drop your lowest assignment grade. If you have a documented medical issue or something similar, email the instructors.

**What do I need to do if I want to audit?** Attend the lectures and sit for the mid-term and final, and quizzes. You don't need to study for the exams - mainly I'm interested to know how much you've absorbed in an audit.
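
One way to read the late-submission rule in the extension answer is as a multiplier on the score an assignment would otherwise earn. The sketch below assumes lateness is measured in hours and that no credit is given after the 48-hour window closes, which the FAQ implies but does not state outright. The name `late_credit_multiplier` is hypothetical.

```python
def late_credit_multiplier(hours_late):
    """Return the fraction of credit kept for a submission `hours_late` hours late.

    Assumption (not stated explicitly in the FAQ): submissions more than
    48 hours late receive no credit.
    """
    if hours_late <= 0:
        return 1.0   # on time: full credit
    if hours_late <= 48:
        return 0.5   # within the 48-hour window: 50% credit
    return 0.0       # assumed: no credit after the window


if __name__ == "__main__":
    # A made-up raw score of 90, submitted 5 hours late.
    print(90 * late_credit_multiplier(5))
```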

### Policy on Collaboration among Students

These policies are the same as were used in Dr. Rosenfeld's version of 2013.

The purpose of student collaboration is to facilitate learning, not to circumvent it. Studying the material in groups is strongly encouraged. It is also allowed to seek help from other students in understanding the material needed to solve a particular homework problem, provided no written notes are shared, or are taken at that time, and provided learning is facilitated, not circumvented. The actual solution must be done by each student alone, and the student should be ready to reproduce their solution upon request.

**The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved**, on the first page of their assignment. Specifically, each assignment solution must start by answering the following questions:

(1) Did you receive any help whatsoever from anyone in solving this assignment? Yes / No. If you answered 'yes', give full details: _______________ (e.g. "Jane explained to me what is asked in Question 3.4")

(2) Did you give any help whatsoever to anyone in solving this assignment? Yes / No. If you answered 'yes', give full details: _______________ (e.g. "I pointed Joe to section 2.3 to help him with Question 2")

Additionally, if you share any material or collaborate in any way between the time the assignment is due and the last time when the assignment can be handed in for partial credit, you must notify the instructor of this help in writing (e.g., via email).

Collaboration without full disclosure will be handled severely, in compliance with CMU's Policy on Cheating and Plagiarism. Except in unusual extenuating circumstances, the policy is to **fail the student(s) for the entire course**.

As a related point, some of the homework assignments used in this class may have been used in prior versions of this class, or in classes at other institutions. Avoiding the use of heavily tested assignments will detract from the main purpose of these assignments, which is to reinforce the material and stimulate thinking. Because some of these assignments may have been used before, solutions to them may be (or may have been) available online, or from other people. It is explicitly forbidden to use any such sources, or to consult people who have solved these problems before. **You must solve the homework assignments completely on your own**. I will mostly rely on your wisdom and honor to follow this rule, but if a violation is detected it will be dealt with harshly. Collaboration with other students who are currently taking the class is allowed, but only under the conditions stated above.