Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2015"

From Cohen Courses
Line 15: Line 15:
 
* Tues Jan 27. [[Class meeting for 10-605 Rocchio and On-line Learning|Phrase Finding and Rocchio]]
 
* Tues Jan 27. [[Class meeting for 10-605 Rocchio and On-line Learning|Phrase Finding and Rocchio]]
 
** '''HW1A and HW1B due.'''
 
** '''HW1A and HW1B due.'''
 +
** ''HW2: phrase finding with stream-and-sort''. [http://curtis.ml.cmu.edu/w/courses/images/5/5e/Phrases.pdf PDF Handout]
 
* Thurs Jan 29. [[Class meeting for 10-605 Parallel Perceptrons|Rocchio and Parallel Perceptrons]]
 
* Thurs Jan 29. [[Class meeting for 10-605 Parallel Perceptrons|Rocchio and Parallel Perceptrons]]
  
Line 20: Line 21:
  
 
* Tues Feb 3. [[Class meeting for 10-605 Hadoop 1|Perceptrons/Map-reduce and Hadoop]].
 
* Tues Feb 3. [[Class meeting for 10-605 Hadoop 1|Perceptrons/Map-reduce and Hadoop]].
** '''Assignment due: streaming Naive Bayes 2 (with feature counts on disk) with stream-and-sort'''
 
** ''New Assignment: phrase finding with stream-and-sort''. [http://curtis.ml.cmu.edu/w/courses/images/5/5e/Phrases.pdf PDF Handout]
 
 
* Thurs Feb 5.  [[Class meeting for 10-605 Parallel Perceptrons 2|Parallel Perceptrons]].
 
* Thurs Feb 5.  [[Class meeting for 10-605 Parallel Perceptrons 2|Parallel Perceptrons]].
 
* Tues Feb 10. '''student presentations'''
 
* Tues Feb 10. '''student presentations'''
 +
** '''Assignment due: phrase finding with stream-and-sort'''
 +
** ''HW3,4: Naive Bayes with Streaming Hadoop,  Naive Bayes with Hadoop & Phrase-finding with Hadoop''. [http://curtis.ml.cmu.edu/w/courses/images/c/c0/Homework4a.pdf PDF Handout (4a) HW4 - warmup]
 +
[http://curtis.ml.cmu.edu/w/courses/images/a/a2/Homework4b.pdf PDF Handout (4b) HW4]
 +
[http://curtis.ml.cmu.edu/w/courses/images/3/30/Homework4c.pdf PDF Handout (4c) HW5]
 
* Thurs Feb 12. '''student presentations'''
 
* Thurs Feb 12. '''student presentations'''
 
* Tues Feb 17. [[Class meeting for 10-605 SGD and Hash Kernels|Scalable SGD and Hash Kernels]]
 
* Tues Feb 17. [[Class meeting for 10-605 SGD and Hash Kernels|Scalable SGD and Hash Kernels]]
** '''Assignment due: phrase finding with stream-and-sort'''
+
** '''HW3 due. (Naive Bayes with Hadoop)'''
** ''New Assignments: Naive Bayes with Streaming Hadoop,  Naive Bayes with Hadoop & Phrase-finding with Hadoop''. [http://curtis.ml.cmu.edu/w/courses/images/c/c0/Homework4a.pdf PDF Handout (4a)][http://curtis.ml.cmu.edu/w/courses/images/a/a2/Homework4b.pdf PDF Handout (4b)][http://curtis.ml.cmu.edu/w/courses/images/3/30/Homework4c.pdf PDF Handout (4c)]
 
 
* Thurs Feb 19. [[Class meeting for 10-605 SGD for MF|Matrix Factorization and SGD, plus another Hadoop demo]]
 
* Thurs Feb 19. [[Class meeting for 10-605 SGD for MF|Matrix Factorization and SGD, plus another Hadoop demo]]
 
* Tues Feb 24. [[Class meeting for 10-605 SGD for MF 2 and Randomized Algorithms|SGD for Matrix Factorization, and Randomized Algorithms 1 (Bloom Filters)]]
 
* Tues Feb 24. [[Class meeting for 10-605 SGD for MF 2 and Randomized Algorithms|SGD for Matrix Factorization, and Randomized Algorithms 1 (Bloom Filters)]]
** '''Streaming run on Hadoop of Naive Bayes due'''
 
 
* Thurs Feb 26. [[Class meeting for 10-605 Graphs 2|Randomized Algorithms]]
 
* Thurs Feb 26. [[Class meeting for 10-605 Graphs 2|Randomized Algorithms]]
** '''Non-streaming run on Hadoop of Naive Bayes due.'''
+
 
  
 
== March  ==
 
== March  ==
Line 38: Line 39:
 
* Tues Mar 3. '''student presentations'''
 
* Tues Mar 3. '''student presentations'''
 
* Thurs Mar 5. '''student presentations'''
 
* Thurs Mar 5. '''student presentations'''
** '''Hadoop assignment (phrase-finding) due'''
+
** '''HW4 due. (Phrase-finding with Hadoop)'''
* Tues Mar 10. ''no class - spring break.''
+
** ''HW5: memory-efficient SGD'' [http://curtis.ml.cmu.edu/w/courses/images/0/08/Sgd.pdf PDF handout]
 +
* Tues Mar 10. ''no class - spring break.''
 
* Thurs Mar 12. ''no class - spring break.''
 
* Thurs Mar 12. ''no class - spring break.''
 
* Tues Mar 17. [[Class meeting for 10-605 Subsample A Graph|Scalable PageRank]]
 
* Tues Mar 17. [[Class meeting for 10-605 Subsample A Graph|Scalable PageRank]]
** ''New Assignment: memory-efficient SGD'' [http://curtis.ml.cmu.edu/w/courses/images/0/08/Sgd.pdf PDF handout]
+
** HW5 due: memory-efficient SGD  
 +
** ''HW6: Subsampling and visualizing a graph.'' [http://curtis.ml.cmu.edu/w/courses/images/e/eb/ApproxPageRank.pdf PDF handout]
 
* Thurs Mar 19. [[Class meeting for 10-605 Subsampling Graphs|Subsampling a graph with RWR]]
 
* Thurs Mar 19. [[Class meeting for 10-605 Subsampling Graphs|Subsampling a graph with RWR]]
 
* Tues Mar 24. [[Class meeting for 10-605 SSL on Graphs|Subsampling continued and SSL on Graphs]]  '''AAAI Spring Symposium week'''
 
* Tues Mar 24. [[Class meeting for 10-605 SSL on Graphs|Subsampling continued and SSL on Graphs]]  '''AAAI Spring Symposium week'''
 
* Thurs Mar 26. [[Class meeting for 10-605 Spectral Clustering|Scalable spectral clustering techniques.]] '''AAAI Spring Symposium week'''
 
* Thurs Mar 26. [[Class meeting for 10-605 Spectral Clustering|Scalable spectral clustering techniques.]] '''AAAI Spring Symposium week'''
** Assignment due: memory-efficient SGD
+
 
 
* Tues Mar 31. [[Class meeting for 10-605 LDA 1|Sparse sampling and parallelization for LDA]]
 
* Tues Mar 31. [[Class meeting for 10-605 LDA 1|Sparse sampling and parallelization for LDA]]
** '''Assignment due: memory-efficient SGD'''
+
 
** ''New Assignment: Subsampling and visualizing a graph.'' [http://curtis.ml.cmu.edu/w/courses/images/e/eb/ApproxPageRank.pdf PDF handout]
 
  
 
== April  ==
 
== April  ==

Revision as of 16:46, 5 January 2015

This is the syllabus for Machine Learning with Large Datasets 10-605 in Spring 2015.

Notes:

  • The assignments are from 2014, and will be modified over the course of the semester - some may be changed substantially.
  • Lecture notes and/or slides will be posted around the time of the lectures.

January

February

PDF Handout (4b) HW4 PDF Handout (4c) HW5


March


April

Topics covered in previous years but not in 2015