Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2013"

From Cohen Courses
 
(3 intermediate revisions by 2 users not shown)
Line 4: Line 4:
  
 
* Mon Jan 14. [[Class meeting for 10-605 2013 01 14|Overview of course, cost of various operations, asymptotic analysis.]]
 
* Wed Jan 16. [[Class meeting for 10-605 2013 01 16|Review of probabilities.]]
+
* Wed Jan 16. [[Class meeting for 10-605 2013 01 16|Review of probabilities, joint-distributions, and naive Bayes]]
* Mon Jan 21. [[Class meeting for 10-605 2013 01 21|no class - Martin Luther King Day]]
+
* Mon Jan 21. ''No class - Martin Luther King Day''
 
* Wed Jan 23. [[Class meeting for 10-605 2013 01 23|Streaming algorithms and Naive Bayes; The stream-and-sort design pattern; Naive Bayes for large feature sets.]]
 
 
** ''New Assignment: streaming Naive Bayes 1 (with feature counts in memory)''. [http://www.cs.cmu.edu/~wcohen/10-605/assignments/hashtable-nb.pdf PDF Handout]
 
Line 33: Line 33:
  
 
* Mon Mar 4. [[Class meeting for 10-605 2013 03 04|Learning on graphs 2]].  
 
* Wed Mar 6. ''Guest lecture: John Wong (Google): Machine Learning with Large Datasets in Google Shopping"
+
* Wed Mar 6. ''Guest lecture: John Wong (Google): Machine Learning with Large Datasets in Google Shopping''
 
** '''Hadoop assignment (phrase-finding) due'''
 
 
** ''New Assignment: memory-efficient SGD'' [http://www.cs.cmu.edu/~wcohen/10-605/assignments/sgd.pdf PDF writeup]
 
Line 93: Line 93:
 
== May ==
 
  
* 9am, Tuesday, May 7.  '''Project writeups due'''.  Submit a paper to Blackbook in PDF in the [http://icml.cc/2013/author-instructions/ ICML 2013 format] (minimum 5 pp, up to 8pp double column), except, of course, do not submit it anonymously.
+
* 9am, Tuesday, May 7.  '''Project writeups due'''.  Submit a paper to Blackbook in PDF in the [http://icml.cc/2013/wp-content/uploads/2012/12/icml2013stylefiles.tar.gz ICML 2013 format] (minimum 5 pp, up to 8pp double column), except, of course, do not submit it anonymously.
 
** ''Note: this is extended from previous deadline of Fri May 3---but I can't give any further extensions!''  Your project report should discuss
 
 
*** The problem you're trying to solve, and why it's important and/or interesting.
 

Latest revision as of 17:20, 10 January 2014

This is the syllabus for Machine Learning with Large Datasets 10-605 in Spring 2013.

January

February

March

April and May

Project reports: Please upload your slides to Blackboard before class, by *1:00pm*.

  • Wed Apr 24. Project reports.
    • Team1: Namit Shetty, Namit Katariya
    • Team2: Jieru Shi, Luzheng Sheng
    • Team3: Edward Zhang, Weihua Cao, Yue Ma
    • Team4: Yibin Lin, Yu Gong
    • Team5: Sukhada Palkar
    • Team6: Han Yang, Qiangjian Xi
    • Team7: Russell Cullen, Jonathan Hsu
  • Mon Apr 29. Project reports.
    • Team8: Andrea Klein, Dipan Pal
    • Team9: Zeyuan Li, Pengqi Liu, Fei Xie
    • Team10: Yiwen Chen, Zhiqi Li, Yuliang Yin
    • Team11: Ye Zhang, Hao Chen, Qi Wang
    • Team12: Chunlei Liu, Zhen Tang
    • Team13: Zaid Sheikh, Shourabh Rawat, Sushant Kumar
    • Team14: Huanchen Zhang, Mengwei Ding
  • Wed May 1. Project reports.
    • Team15: Shu-Hao Yu, Guanyu Wang, Mayank Mohta
    • Team16: Li Lu, Chun Chen, Yuchen Tian
    • Team17: Shannon Quinn
    • Team18: Avesh Singh, Adam Mihalcin
    • Team19: Yubin Kim, Juan Manuel Caicedo Carvajal
    • Team20: Yue Yu, Jie Dai, Mayank Ketkari
    • Team21: Varuni Gang, Alkeshkumar Patel
    • Assignment due: Multi-class image classification or scalable classification.

May

  • 9am, Tuesday, May 7. Project writeups due. Submit your paper to Blackboard as a PDF in the ICML 2013 format (minimum 5 pp, up to 8 pp, double column), except, of course, do not submit it anonymously.
    • Note: this is extended from the previous deadline of Fri May 3, but I can't give any further extensions! Your project report should discuss:
      • The problem you're trying to solve, and why it's important and/or interesting.
      • Related work, especially any related work that you're building on.
      • The data that you're working with.
      • The methods that you're using (in some detail - even if these are off-the-shelf methods, I want to know that you understand them).
      • The experiments you did, the metrics you used to evaluate them, and the results.
      • What was learned from the experiments (the conclusions).
    • You should think of this as an exercise in writing a conference-style paper, so try to write in that style. (Of course, your work doesn't need to advance the state of the art in machine learning, or be highly novel, but it should be well described.)