Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2015"
From Cohen Courses
Revision as of 14:29, 17 September 2015
This is the syllabus for Machine Learning with Large Datasets 10-605 in Fall 2015.
Notes:
- Homeworks, unless otherwise posted, will be due when the next HW comes out.
- Lecture notes and/or slides will be (re)posted around the time of the lectures.
- Tues Sep 1. Overview of course, cost of various operations, asymptotic analysis.
- Thurs Sep 3. Review of probabilities, joint distributions and naive Bayes
- Tues Sep 8. Streaming algorithms and naive Bayes; the stream-and-sort design pattern; naive Bayes for large feature sets.
  - HW1 out: streaming naive Bayes in Java. PDF Handout
- Thurs Sep 10. Phrase Finding
- Tues Sep 15. Implementing Phrase Finding with Stream-and-Sort
  - Also: Guest lecture from Manik Varma, MSR.
- Thurs Sep 17. Hadoop Overview
  - HW2 out: naive Bayes training on Hadoop in Java. PDF Handout
- Tues Sep 22. Rocchio and TFIDF
- Thurs Sep 24. Fast KNN and similarity joins
- Tues Sep 29. Scalable SGD and Hash Kernels
  - HW3 out: applying a large linear classifier to a large test set in Hadoop.
- Thurs Oct 1. TBA
  - For 805 students: an initial project proposal is due. You will get feedback on it from the instructors, and it will also be posted to the class, mainly for 605 students who are interested in collaborating, but also for general interest.
- Tues Oct 6. Parallel Perceptrons 1.
- Thurs Oct 8. Parallel Perceptrons 2.
- Tues Oct 13. Parameter servers and AllReduce
  - HW4 out: streaming logistic regression classifier
- Thurs Oct 15. Matrix Factorization and SGD
  - For 805 students: the final project proposal is due.
- Tues Oct 20. Guest lecture from Mark Torrance of RocketFuel
- Thurs Oct 22. Midterm exam
- Tues Oct 27. Randomized Algorithms 1
- Thurs Oct 29. Randomized Algorithms 2
  - HW5 out: (tentatively) SGD with a parameter server
- Tues Nov 3. Scalable PageRank
- Thurs Nov 5. Subsampling a graph with RWR
- Tues Nov 10. SSL on Graphs
  - HW6 out: (tentatively) DSGD for matrix factorization
- Thurs Nov 12. Graph models for large-scale ML
- Tues Nov 17. Sparse sampling and parallelization for LDA
- Thurs Nov 19. Speeding up LDA-like models: All-reduce and other tricks
- Tues Nov 24. TBA
  - HW7 out: TBA
- Thurs Nov 26. Happy Thanksgiving!
- Tues Dec 1. First-order logics
- Thurs Dec 3. Scalable First-order logics
- Tues Dec 8. Scalable spectral clustering techniques.
  - HW7 due
- Thurs Dec 10. In-class exam.