Machine Learning 10-601 in Fall 2013

Instructor and Venue

  • Instructors: William Cohen and Eric Xing, Machine Learning Dept and LTI
  • Course secretary: Sharon Cavlovich, sharonw+@cs.cmu.edu, 412-268-5196
  • When/where: M/W 1:30-2:50, Doherty Hall 2315
    • Classes will start on Wednesday, Sept 4 (the Wed after Labor Day)
  • Course Number: ML 10-601
  • TAs:
    • Gopala Anumanchipalli (gka@andrew.cmu.edu)
    • William Yang Wang (ww@cmu.edu)
    • Guanyu Wang (wgiveny@gmail.com)
    • Shu-Hao Yu (junglesam9595@hotmail.com)
    • plus others TBD...
  • Syllabus: Syllabus for Machine Learning 10-601.
  • Office hours:
    • TBD
  • Email and forum:
    • TBD

Description

Machine Learning (ML) asks "how can we design programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience, e.g. spotting high-risk medical patients, recognizing speech, classifying text documents, detecting credit card fraud, or driving autonomous robots.

Topics covered in 10-601 include concept learning, version spaces, decision trees, neural networks, computational learning theory, active learning, estimation & the bias-variance tradeoff, hypothesis testing, Bayesian learning, the Minimum Description Length principle, the Gibbs classifier, Naïve Bayes classifier, Bayes Nets & Graphical Models, the EM algorithm, Hidden Markov Models, K-Nearest-Neighbors and nonparametric learning, reinforcement learning, genetic algorithms, bagging and boosting.

10-601 focuses on the mathematical, statistical and computational foundations of the field. It emphasizes the role of assumptions in machine learning. As we introduce different ML techniques, we work out together what assumptions are implicit in them. We use the Socratic method whenever possible, and student participation is expected. Grading is based on written assignments, programming assignments, and a final exam.

10-601 focuses on understanding what makes machine learning work. If your interest is primarily in learning the process of applying ML effectively, and in the practical side of ML for applications, you should consider Machine Learning in Practice (11-344/05-834).

10-601 is open to all but is recommended for CS Seniors & Juniors, Quantitative Masters students, and non-SCS PhD students.

Syllabus

Previous syllabi are available for the historically-minded.

Prerequisites

Prerequisites are 15-122: Principles of Imperative Computation AND 21-127: Concepts of Mathematics.

Additionally, a probability course is a co-requisite: 36-217: Probability Theory and Random Processes OR 36-225: Introduction to Probability and Statistics I

A minimum grade of 'C' is required in all these courses.

Projects

These are proposed project ideas, for discussion by the 10-601 teaching team (Eric, William, and the TAs). No final decisions have been made.

  • Building a deep belief network learning system (Eric's proposal). The project is to implement and train a deep belief network, and test it on the ImageNet dataset (http://www.image-net.org/challenges/LSVRC/2012/; the full dataset is 10M images, 10k categories, but we will choose a smaller subset for experimental purposes). This could be run as a Kaggle-like competition. A small pretraining sketch appears after this list.
  • Building and evaluating an out-of-the-box classifier learner (William's proposal). Some learning algorithms require more tuning for a new problem than others, but most of what is known about how to tune classifiers for a learning task is folklore, not science. The question here is: which algorithms are most robust? To address this, I suggest a competition with the following rules (a sketch of the scoring protocol also appears after this list).
    • Submitted learners will be scored by their average error rate (say) over 5 evaluation learning tasks, each of which has an associated train/test split.
    • The evaluation tasks are not known in advance; instead, there are 20 development learning tasks, each with an associated train/test split, for tuning the learning system.
    • The learning system could be one of several things:
      1. A plain classifier learner (e.g., a standard implementation of random forests might be a good baseline).
      2. A classifier learner with a wrapper around it that does a parameter sweep and picks the best parameter setting.
      3. A set of K classifier learners, with internal cross-validation used to pick the best of the K.
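
The sketch below illustrates the core of the first proposal under loud assumptions: it greedily pretrains two stacked RBM layers and puts a logistic regression on top, using scikit-learn's BernoulliRBM and the small digits dataset as a stand-in for an ImageNet subset. A real entry would fine-tune the whole stack and need GPU-scale tooling; this is a minimal sketch, not the project's required implementation.

 # Minimal DBN-style sketch: greedy layer-wise RBM pretraining with a
 # supervised top layer. The dataset and all hyperparameters are
 # illustrative assumptions; digits stands in for an ImageNet subset.
 from sklearn.datasets import load_digits
 from sklearn.model_selection import train_test_split
 from sklearn.neural_network import BernoulliRBM
 from sklearn.linear_model import LogisticRegression
 from sklearn.pipeline import Pipeline

 X, y = load_digits(return_X_y=True)
 X = X / 16.0  # BernoulliRBM expects features in [0, 1]
 X_train, X_test, y_train, y_test = train_test_split(
     X, y, test_size=0.2, random_state=0)

 # Pipeline.fit trains rbm1 on the raw inputs, rbm2 on rbm1's hidden
 # activations, and the classifier on rbm2's output -- i.e., greedy
 # layer-wise pretraining (without the final fine-tuning a full DBN uses).
 dbn = Pipeline([
     ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05,
                           n_iter=20, random_state=0)),
     ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05,
                           n_iter=20, random_state=0)),
     ("clf", LogisticRegression(max_iter=1000)),
 ])
 dbn.fit(X_train, y_train)
 print("held-out accuracy:", dbn.score(X_test, y_test))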
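
The next sketch spells out the proposed scoring rule and the third kind of learning system, under stated assumptions: each task is a train/test tuple and submitted learners follow the scikit-learn estimator interface. The names average_error and CVPickBest are hypothetical, not existing course infrastructure.

 # Minimal sketch of the proposed competition protocol. Assumptions:
 # each task is an (X_train, y_train, X_test, y_test) tuple and a
 # "learner" is any scikit-learn-style estimator. average_error and
 # CVPickBest are hypothetical names, not course infrastructure.
 import numpy as np
 from sklearn.base import BaseEstimator, clone
 from sklearn.model_selection import cross_val_score

 def average_error(learner, tasks):
     """Scoring rule: average test-set error rate over the tasks."""
     errors = []
     for X_train, y_train, X_test, y_test in tasks:
         model = clone(learner).fit(X_train, y_train)
         errors.append(1.0 - model.score(X_test, y_test))
     return float(np.mean(errors))

 class CVPickBest(BaseEstimator):
     """Option 3: hold a set of K learners and use internal
     cross-validation on the training data to pick one per task."""
     def __init__(self, learners, cv=5):
         self.learners = learners
         self.cv = cv
     def fit(self, X, y):
         cv_scores = [cross_val_score(clone(l), X, y, cv=self.cv).mean()
                      for l in self.learners]
         best = self.learners[int(np.argmax(cv_scores))]
         self.best_ = clone(best).fit(X, y)
         return self
     def score(self, X, y):
         return self.best_.score(X, y)

 # Hypothetical usage: tune on the 20 development tasks, then submit;
 # the organizers would run something like:
 #   system = CVPickBest([RandomForestClassifier(), LogisticRegression()])
 #   print(average_error(system, evaluation_tasks))

Option 2 fits the same interface: wrapping a single learner in scikit-learn's GridSearchCV yields an estimator whose fit does the parameter sweep, so it can be scored by the same average_error function.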