Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2013

From Cohen Courses
Revision as of 16:59, 19 April 2013 by Yanbox

This is the syllabus for Machine Learning with Large Datasets 10-605 in Spring 2013.

January

February

March

April and May

  • Mon Apr 1. Speeding up LDA-like models: sparse sampling and parallelization
  • Wed Apr 3. Speeding up LDA-like models: All-reduce and online LDA
    • Assignment due: Subsampling and visualizing a graph.
    • New Assignment: K-Means on MapReduce. PDF writeup
  • Mon Apr 8. Fast KNN and similarity joins 1.
  • Wed Apr 10. Fast KNN and similarity joins 2.
  • Mon Apr 15. Scaling up decision tree learning
    • Project progress report due
  • Wed Apr 17. Gradient boosting with trees, and SGD for matrix factorization
  • Mon Apr 22. Guest lecture, Evangelos Papalexakis, on Scalable Tensor Methods.
  • Wed Apr 24. Project reports. Please upload your slides to Blackboard in advance by *1:00pm*
    • Team1: Namit Shetty, Namit Katariya
    • Team2: Jieru Shi, Luzheng Sheng
    • Team3: Edward Zhang, Weihua Cao, Yue Ma
    • Team4: Yibin Lin, Yu Gong
    • Team5: Sukhada Palkar
    • Team6: Han Yang, Qiangjian Xi
    • Team7: Russell Cullen, Jonathan Hsu
  • Mon Apr 29. Project reports. Please upload your slides to Blackboard in advance by *1:00pm*
    • Team8: Andrea Klein, Dipan Pal
    • Team9: Zeyuan Li, Pengqi Liu, Fei Xie
    • Team10: Yiwen Chen, Zhiqi Li, Yuliang Yin
    • Team11: Ye Zhang, Hao Chen, Qi Wang
    • Team12: Chunlei Liu, Zhen Tang
    • Team13: Zaid Sheikh, Shourabh Rawat, Sushant Kumar
    • Team14: Huanchen Zhang, Mengwei Ding
  • Wed May 1. Project reports. Please upload your slides to Blackboard in advance by *1:00pm*
    • Team15: Shu-Hao Yu, Guanyu Wang, Mayank Mohta
    • Team16: Li Lu, Chun Chen, Yuchen Tian
    • Team17: Shannon Quinn
    • Team18: Avesh Singh, Adam Mihalcin
    • Team19: Yubin Kim, Juan Manuel Caicedo Carvajal
    • Team20: Yue Yu, Jie Dai, Mayank Ketkari
    • Team21: Varuni Gang, Alkeshkumar Patel
    • Assignment due: Multi-class image classification or scalable classification.

May

  • 9am, Tuesday, May 7. Project writeups due. Submit a paper to Blackboard as a PDF in the ICML 2013 format (minimum 5 pp, up to 8 pp, double column), except, of course, do not submit it anonymously.
    • Note: this is extended from the previous deadline of Fri May 3, but I can't give any further extensions! Your project report should discuss:
      • The problem you're trying to solve, and why it's important and/or interesting.
      • Related work, especially any related work that you're building on.
      • The data that you're working with.
      • The methods that you're using, in some detail. (Even if these are off-the-shelf methods, I want to know that you understand them.)
      • The experiments you did, the metrics you used to evaluate them, and the results.
      • What was learned from the experiments (the conclusions).
    • You should think of this as an exercise in writing a conference-style paper, so try to write in that style. (Of course, your work doesn't need to advance the state of the art in machine learning, or be highly novel, but it should be well described.)