Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2014"

From Cohen Courses
Line 49: Line 49:
 
== April and May ==
 
  
* Mon Apr 1. [[Class meeting for 10-605 2013 04 01|Speeding up LDA-like models: sparse sampling and parallelization]]
+
* Wed Apr 1. [[Class meeting for 10-605 2013 LDA 2|Speeding up LDA-like models: All-reduce and online LDA]]
* Wed Apr 3. [[Class meeting for 10-605 2013 04 03|Speeding up LDA-like models: All-reduce and online LDA]]
 
 
** '''Assignment due: Subsampling and visualizing a graph.'''
 
 
** ''New Assignment: K-Means on MapReduce.'' [http://www.cs.cmu.edu/~wcohen/10-605/assignments/kmeans.pdf PDF writeup]
 
* Mon Apr 8. [[Class meeting for 10-605 2013 04 08|Fast KNN and similarity joins 1.]]
+
* Mon Apr 7. [[Class meeting for 10-605 2013 04 08|Fast KNN and similarity joins 1.]]
* Wed Apr 10. [[Class meeting for 10-605 2013 04 10|Fast KNN and similarity joins 2.]]
+
* Wed Apr 9. [[Class meeting for 10-605 2013 04 10|Fast KNN and similarity joins 2.]]
* Mon Apr 15. [[Class meeting for 10-605 2013 04 15|Scaling up decision tree learning]]
+
* Mon Apr 14. [[Class meeting for 10-605 2013 04 15|Scaling up decision tree learning]]
 
** '''Project progress report due'''
 
* Wed Apr 17.  [[Class meeting for 10-605 2013 04 17|Gradient boosting with trees, and SGD for matrix factorization]]
+
* Wed Apr 16.  [[Class meeting for 10-605 2013 04 17|Gradient boosting with trees, and SGD for matrix factorization]]
 
** '''Assignment due: K-Means on MapReduce.'''
 
 
** ''New Assignment: Multi-class image classification or scalable classification using a linear classifier.''  Both of these count as one assignment toward your six.
 
 
*** [http://www.cs.cmu.edu/~wcohen/10-605/assignments/image.pdf PDF writeup of image-classification assignment]
 
 
*** [http://www.cs.cmu.edu/~wcohen/10-605/assignments/big-classifier.pdf PDF writeup of scalable classification]
 
* Mon Apr 22. ''Guest lecture, Evangelos Papalexakis, on Scalable Tensor Methods.''
+
* Mon Apr 21. TBD
Project reports: '''Please upload your slides to Blackboard before the class, by *1:00pm*'''
+
* Wed Apr 23. TBD
* Wed Apr 24. Project reports.
+
* Mon Apr 28. TBD
** Team1: Namit Shetty, Namit Katariya
+
* Wed Apr 30. In-class exam.
** Team2: Jieru Shi, Luzheng Sheng
 
** Team3: Edward Zhang, Weihua Cao, Yue Ma
 
** Team4: Yibin Lin, Yu Gong
 
** Team5: Sukhada Palkar
 
** Team6: Han Yang, Qiangjian Xi
 
** Team7: Russell Cullen, Jonathan Hsu
 
* Mon Apr 29. Project reports.  
 
** Team8: Andrea Klein, Dipan Pal
 
** Team9: Zeyuan Li, Pengqi Liu, Fei Xie
 
** Team10: Yiwen Chen, Zhiqi Li, Yuliang Yin
 
** Team11: Ye Zhang, Hao Chen, Qi Wang
 
** Team12: Chunlei Liu, Zhen Tang
 
** Team13: Zaid Sheikh, Shourabh Rawat, Sushant Kumar
 
** Team14: Huanchen Zhang, Mengwei Ding
 
* Wed May 1. Project reports.  
 
** Team15: Shu-Hao Yu, Guanyu Wang, Mayank Mohta
 
** Team16: Li Lu, Chun Chen, Yuchen Tian
 
** Team17: Shannon Quinn
 
** Team18: Avesh Singh, Adam Mihalcin
 
** Team19: Yubin Kim, Juan Manuel Caicedo Carvajal
 
** Team20: Yue Yu, Jie Dai, Mayank Ketkari
 
** Team21: Varuni Gang, Alkeshkumar Patel
 
 
** '''Assignment due: Multi-class image classification or scalable classification.'''
 
  

Revision as of 18:07, 8 January 2014

This is the syllabus for Machine Learning with Large Datasets 10-605 in Spring 2014.

January

February

March

April and May

May

  • 9am, Tuesday, May 7. Project writeups due. Submit a paper to Blackboard as a PDF in the ICML 2013 format (minimum 5 pages, up to 8 pages, double column), except, of course, do not submit it anonymously.
    • Note: this is extended from the previous deadline of Fri May 3, but I can't give any further extensions! Your project report should discuss:
      • The problem you're trying to solve, and why it's important and/or interesting.
      • Related work, especially any related work that you're building on.
      • The data that you're working with.
      • The methods that you're using, in some detail. (Even if these are off-the-shelf methods, I want to know that you understand them.)
      • The experiments you did, the metrics you used to evaluate them, and the results.
      • What was learned from the experiments (the conclusions).
    • You should think of this as an exercise in writing a conference-style paper, so try to write in that style. (Of course, your work doesn't need to advance the state of the art in machine learning, or be highly novel, but it should be well described.)