Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016"
From Cohen Courses
Revision as of 16:58, 25 July 2016
This is the syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016.
Notes:
- Homeworks, unless otherwise posted, will be due when the next HW comes out.
- Lecture notes and/or slides will be (re)posted around the time of the lectures.
note: this is under construction
September
- Thurs Sep 1. Overview of course, cost of various operations, asymptotic analysis.
- Tues Sep 6. Review of probabilities, joint distributions and naive Bayes
- HW1A out: streaming naive Bayes. draft Handout
- Thurs Sep 8. Streaming algorithms and Naive Bayes; the stream-and-sort design pattern; Naive Bayes for large feature sets.
- Tues Sep 13. Phrase Finding and Hadoop
- HW1B out: naive Bayes training on Hadoop. draft Handout
- Hadoop Overview
- Phrase Finding
- Thurs Sep 15. Implementing Phrase Finding and Large-Data Testing for Naive Bayes with Stream-and-Sort.
- Lecture also discusses: map-reduce abstractions/dataflow
- Tues Sep 20. Hadoop Workflow Languages and Rocchio and TFIDF
- HW3 out: Using workflow languages.
- Thurs Sep 22. Hadoop Workflow Languages and Rocchio and TFIDF, continued
- Lecture also discusses: hadoop streaming, mrjob, cascading, pipes, scaling, hive, pig, spark, flink
- Tues Sep 27. Fast KNN and similarity joins
- Thurs Sep 29. Scalable SGD and Hash Kernels
- HW4 out: streaming logistic regression classifier PDF Handout
- For 805 students: an initial project proposal is due via email to wcohen+805@gmail.com. You will get feedback on it from the instructors, and it will also be posted to the class, mainly for 605 students who are interested in collaborating, but also for general interest. Please be clear about your proposal; I'm expecting approximately one page. You should discuss what dataset you plan to use, what results you hope to obtain, and what baseline technique you will build on and/or compare to. Also include a section saying whether you have a partner, whether you are willing to work with or mentor one or more 605 students, and, if so, how you anticipate them contributing to the project.
October
- Tues Oct 4. No class - Rosh Hashana.
- Thurs Oct 6. Parallel Perceptrons 1.
- Tues Oct 11. Parallel Perceptrons 2.
- Thurs Oct 13. More on parallel and streaming ML: Adaptive gradients, AllReduce, and Parameter Servers
- Also, some exam review tips (ppt)
- practice questions for midterm - v1. This document also references the relevant questions from two previous review sheets:
- William's note - revised, and discuss param servers more later on
- Tues Oct 18. midterm exam
- Thurs Oct 20. Class meeting for 10-605 Reverse-mode differentiation and Deep Learning 1
- HW4 out: Implementing autograd light
- Tues Oct 25. Class meeting for 10-605 Reverse-mode differentiation and Deep Learning 2
- William's note: will include some material from Matrix Factorization and SGD
- For 805 students: the final project proposal is due.
November
- Tues Nov 1. Scalable PageRank
- Thurs Nov 3. SSL on Graphs
- HW5 out: SSL on Spark
- Tues Nov 8. Randomized Algorithms 1
- Thurs Nov 10. Randomized Algorithms 2
- Tues Nov 15. Randomized Algorithms 3
- HW6 out: parallel deep learning in Spark
- Thurs Nov 17. Sparse sampling and parallelization for LDA
- Tues Nov 22. LDA 2
- Thurs Nov 24. No class - happy Thanksgiving!
- Tues Nov 29. Parameter servers
- HW7 out: LDA with a param server (draft handout)
December
- Thurs Dec 1. Graph models for large-scale ML
- Tues Dec 6. Review and project presentations (15 min each)
- HW7 due
- Thurs Dec 8. In-class exam.
- Tues Dec 15. Writeups for 10-805 projects are due (at 11:59pm).