Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2015"
From Cohen Courses
Revision as of 17:08, 23 August 2015
This is the syllabus for Machine Learning with Large Datasets 10-605 in Fall 2015.
Notes:
- The assignments, when posted, may be drafts based on the assignments from spring 2015 and will be modified over the course of the semester; some may be changed substantially. Due dates may also change by a few days.
- Lecture notes and/or slides will be (re)posted around the time of the lectures.
- Tues Sep 1. Overview of course, cost of various operations, asymptotic analysis.
- Thurs Sep 3. Review of probabilities, joint distributions and naive Bayes
- Tues Sep 8. Streaming algorithms and Naive Bayes; The stream-and-sort design pattern; Naive Bayes for large feature sets.
  - HW1 out: streaming naive Bayes in Java.
- Thurs Sep 10. Messages, records and workflows; Phrase finding.
- Tues Sep 15. Hadoop and Map-Reduce
- Thurs Sep 17. PIG and Other Workflow Systems for Hadoop
  - HW2 out: naive Bayes training on Hadoop in Java.
- Tues Sep 22. Rocchio and TFIDF
- Thurs Sep 24. Fast KNN and similarity joins
- Tues Sep 29. Scalable SGD and Hash Kernels
  - HW3 out: applying a large linear classifier to a large test set in Hadoop.
- Thurs Oct 1. TBA
- Tues Oct 6. Parallel Perceptrons 1.
  - HW4 out: streaming logistic regression classifier
- Thurs Oct 8. Parallel Perceptrons 2.
- Tues Oct 13. Parameter servers and AllReduce
- Thurs Oct 15. Matrix Factorization and SGD
- Tues Oct 20. TBA - tentative, guest lecture
  - HW4 due
- Thurs Oct 22. midterm exam
- Tues Oct 27. Randomized Algorithms 1
  - HW5 out: (tentatively) matrix factorization with a parameter server
- Thurs Oct 29. Randomized Algorithms 2
- Tues Nov 3. Scalable PageRank
- Thurs Nov 5. Subsampling a graph with RWR
- Tues Nov 10. SSL on Graphs
  - HW6 out: TBA
- Thurs Nov 12. Graph models for large-scale ML
- Tues Nov 17. Sparse sampling and parallelization for LDA
- Thurs Nov 19. Speeding up LDA-like models: All-reduce and other tricks
- Tues Nov 24. TBA
  - HW7 out: TBA
- Thurs Nov 26. Happy Thanksgiving!
- Tues Dec 1. First-order logics
- Thurs Dec 3. Scalable First-order logics
- Tues Dec 8. Scalable spectral clustering techniques.
  - HW7 due
- Thurs Dec 10. In-class exam.