Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2012"
From Cohen Courses
Line 36:
  ** '''Assignment due: memory-efficient SGD'''
  ** ''New assignment: mini-project proposals (first draft).''
− * Thus Mar 8. Guest Lecture: Joey Gonzales, GraphLab and Dynamic Asynchronous Computation
+ * Thus Mar 8. Guest Lecture: Joey Gonzales, CMU, GraphLab and Dynamic Asynchronous Computation
  * Tues Mar 13. ''no class - spring break.''
  * Thus Mar 15. ''no class - spring break.''
Line 42:
  ** '''Assignment due: mini-project proposals (first draft).'''
  ** ''New Assignment: Subsampling and visualizing a graph.''
− * Thus Mar 22. Tentative: Guest lecture by U Kang.
+ * Thus Mar 22. Tentative: Guest lecture by U Kang, CMU.
  * Tues Mar 27. Gibbs sampling and LDA.
  ** '''Assignment due: Subsampling and visualizing a graph.'''
Revision as of 11:54, 6 February 2012
This is the syllabus for Machine Learning with Large Datasets 10-605 in Spring 2012.
January
- Tues Jan 17. Overview of course, cost of various operations, asymptotic analysis.
- Thurs Jan 19. Review of probabilities.
- Tues Jan 24. Streaming algorithms and Naive Bayes.
- New Assignment: streaming Naive Bayes 1 (with feature counts in memory). PDF Handout
- Thurs Jan 26. The stream-and-sort design pattern; Naive Bayes revisited.
- Tues Jan 31. Messages and records 1; Phrase finding.
- Assignment due: streaming Naive Bayes 1 (with feature counts in memory).
- New Assignment: streaming Naive Bayes 2 (with feature counts on disk) with stream-and-sort. PDF Handout
February
- Thurs Feb 2. More on streaming algorithms: Rocchio, and theory of on-line learning.
- Tues Feb 7. More on streaming algorithms: parallelized voted perceptron.
- Assignment due: streaming Naive Bayes 2 (with feature counts on disk) with stream-and-sort
- New Assignment: phrase finding with stream-and-sort
- Thurs Feb 9. Map-reduce and Hadoop 1 (Alona lecture).
- Tues Feb 14. Map-reduce and Hadoop 2 (Alona lecture).
- Assignment due: phrase finding with stream-and-sort
- New Assignment: Naive Bayes with Hadoop
- Thurs Feb 16. Naive Bayes and Logistic regression.
- Tues Feb 21. Logistic regression with stochastic gradient descent, parallel SGD
- New Assignment: Phrase-finding with Hadoop
- Thurs Feb 23. Tentative: Guest lecture on Scaling up Machine Learning, Ron Bekkerman, LinkedIn.
- Tues Feb 28. Bloom Filters and Locality sensitive hashing 1.
- Hadoop assignments due
- New Assignment: memory-efficient SGD
March
- Thurs Mar 1. Bloom Filters and Locality sensitive hashing 2.
- Tues Mar 6. Learning on graphs. PageRank, Harmonic field, RWR; tools and design patterns for graphs (Pregel, GraphLab, Schimmy, ...)
- Assignment due: memory-efficient SGD
- New assignment: mini-project proposals (first draft).
- Thurs Mar 8. Guest Lecture: Joey Gonzales, CMU, GraphLab and Dynamic Asynchronous Computation.
- Tues Mar 13. no class - spring break.
- Thurs Mar 15. no class - spring break.
- Tues Mar 20. Spectral clustering and PIC.
- Assignment due: mini-project proposals (first draft).
- New Assignment: Subsampling and visualizing a graph.
- Thurs Mar 22. Tentative: Guest lecture by U Kang, CMU.
- Tues Mar 27. Gibbs sampling and LDA.
- Assignment due: Subsampling and visualizing a graph.
- New Assignment: mini-project proposals (final version)
- Thurs Mar 29. KNN classification and inverted indices.
- Assignment due: mini-project proposals (final version).
April
- Tues Apr 3. Decision trees and random forests 1.
- Thurs Apr 5. Decision trees and random forests 2.
- Tues Apr 10. Soft joins with KNN and inverted indices 1.
- Thurs Apr 12. Soft joins with KNN and inverted indices 2.
- Tues Apr 17. Structured prediction 1.
- Thurs Apr 19. no class - Carnival.
- Tues Apr 24. Structured prediction 2.
- Thurs Apr 26. Additional topics.
May
- Tues May 1. Project reports.
- Thurs May 3. Project reports.