Difference between revisions of "Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016"

From Cohen Courses
Line 8: Line 8:
 
Schedule:
 
Schedule:
  
* Thurs Sep 1, 2016 [[Class meeting for 10-605 Overview|Overview]] Grading policies etc., History of Big Data, Complexity theory and cost of important operations
+
* Thurs Sep 1, 2016 [[Class meeting for 10-605 Overview|Overview]]Grading policies etc., History of Big Data, Complexity theory and cost of important operations
* Tues Sep 6, 2016 [[Class meeting for 10-605 Probability Review|Probability Review]] Counting for big data and density estimation, streaming Naive Bayes, Rocchio and TFIDF
+
* Tues Sep 6, 2016 [[Class meeting for 10-605 Probability Review|Probability Review]]Counting for big data and density estimation, streaming Naive Bayes, Rocchio and TFIDF
* Thurs Sep 8, 2016 [[Class meeting for 10-605 Streaming Naive Bayes|Streaming Naive Bayes]] Notes on scalable Naive Bayes, Local counting in stream and sort
+
* Thurs Sep 8, 2016 [[Class meeting for 10-605 Streaming Naive Bayes|Streaming Naive Bayes]]Notes on scalable Naive Bayes, Local counting in stream and sort
** '''Start work on''' assignment 1a: streaming NB
+
** '''Start work on''' Assignment 1a: Streaming NB. Draft at http://www.cs.cmu.edu/~wcohen/10-605/assignments/hashtable-nb.pdf
* Tues Sep 13, 2016 [[Class meeting for 10-605 Hadoop Overview|Hadoop Overview]] Intro to Hadoop, Hadoop Streaming
+
* Tues Sep 13, 2016 [[Class meeting for 10-605 Hadoop Overview|Hadoop Overview]]Intro to Hadoop, Hadoop Streaming
** '''Start work on'''  assignment 1b: streaming NB on streaming hadoop
+
** '''Start work on'''  Assignment 1b: Streaming NB on Hadoop. Draft at http://www.cs.cmu.edu/~wcohen/10-605/assignments/stream-nb.pdf, https://drive.google.com/file/d/0BzQQ-spWKjhUd0NXSTB6TW82LWM/view
* Thurs Sep 15, 2016 [[Class meeting for 10-605 Workflows For Hadoop|Workflows For Hadoop 1]] Scalable classification, Scalable Rocchio and TFIDF, Abstracts for map-reduce algorithms, Joins in Hadoop, TFIDF in Pig, Guinea Pig intro, TFIDF in Guinea Pig
+
* Thurs Sep 15, 2016 [[Class meeting for 10-605 Workflows For Hadoop|Workflows For Hadoop 1]]Scalable classification, Scalable Rocchio and TFIDF, Abstracts for map-reduce algorithms, Joins in Hadoop, TFIDF in Pig, Guinea Pig intro, TFIDF in Guinea Pig
* Tues Sep 20, 2016 [[Class meeting for 10-605 Workflows For Hadoop|Workflows For Hadoop 2]] Similarity joins, Similarity joins with TFIDF, Parallel simjoins, PageRank in Pig, K-means in Pig, Spark, Systems built on top of Hadoop
+
* Tues Sep 20, 2016 [[Class meeting for 10-605 Workflows For Hadoop|Workflows For Hadoop 2]]Similarity joins, Similarity joins with TFIDF, Parallel simjoins, PageRank in Pig, K-means in Pig, Spark, Systems built on top of Hadoop
** '''Start work on''' assignment 2: naive bayes testing in guinea pig
+
** '''Start work on''' Assignment 2: Naive bayes testing in Guinea Pig, draft at https://drive.google.com/file/d/0B-p8_eIVeEHFM1JOSGFWNFFJcU0/view
* Thurs Sep 22, 2016 [[Class meeting for 10-605 Phrase Finding|Phrase Finding]] Phrase-finding in Pig, Other work with phrases
+
* Thurs Sep 22, 2016 [[Class meeting for 10-605 Phrase Finding|Phrase Finding]]Phrase-finding in Pig, Other work with phrases
* Tues Sep 27, 2016 [[Class meeting for 10-605 SGD and Hash Kernels|SGD and Hash Kernels]] Learning as optimization, Logistic regression with SGD, Regularized SGD, Hash kernels for logistic regression
+
* Tues Sep 27, 2016 [[Class meeting for 10-605 SGD and Hash Kernels|SGD and Hash Kernels]]Learning as optimization, Logistic regression with SGD, Regularized SGD, Hash kernels for logistic regression
* Thurs Sep 29, 2016 [[Class meeting for 10-605 Parallel Perceptrons|Parallel Perceptrons 1]] Debugging ML algorithms
+
* Thurs Sep 29, 2016 [[Class meeting for 10-605 Parallel Perceptrons|Parallel Perceptrons 1]]Debugging ML algorithms
** '''Start work on''' assignment 3: scalable sgd system
+
** '''Start work on''' Assignment 3: Scalable SGD. Draft at http://curtis.ml.cmu.edu/w/courses/images/8/86/Sgd_fall15.pdf
* Thurs Oct 6, 2016 [[Class meeting for 10-605 Parallel Perceptrons|Parallel Perceptrons 2]] Structured perceptrons, Iterative parameter mixing paper
+
* Thurs Oct 6, 2016 [[Class meeting for 10-605 Parallel Perceptrons|Parallel Perceptrons 2]]Structured perceptrons, Iterative parameter mixing paper
* Tues Oct 11, 2016 [[Class meeting for 10-605 SGD for MF|SGD for MF]] Matrix factorization, Matrix factorization with SGD, distributed matrix factorization with SGD
+
* Tues Oct 11, 2016 [[Class meeting for 10-605 SGD for MF|SGD for MF]]Matrix factorization, Matrix factorization with SGD, distributed matrix factorization with SGD
* Thurs Oct 13, 2016 [[Class meeting for 10-605 Midterm review|Midterm review]]  
+
* Thurs Oct 13, 2016 [[Class meeting for 10-605 Midterm review|Midterm review]]
 
** '''Last assignment due'''
 
** '''Last assignment due'''
* Tues Oct 18, 2016 [[Class meeting for 10-605 Midterm|Midterm]]  
+
* Tues Oct 18, 2016 [[Class meeting for 10-605 Midterm|Midterm]]
* Thurs Oct 20, 2016 [[Class meeting for 10-605 Subsampling a Graph|Subsampling a Graph]] Sampling a graph, Local partitioning
+
* Thurs Oct 20, 2016 [[Class meeting for 10-605 Subsampling a Graph|Subsampling a Graph]]Sampling a graph, Local partitioning
** '''Start work on''' assignment 4: graph subsampling
+
** '''Start work on''' Assignment 4: Subsampling a Graph with Approximate PageRank, draft at https://drive.google.com/file/d/0BzQQ-spWKjhUaWoyOFZHV21uUlU/view
* Tues Oct 25, 2016 [[Class meeting for 10-605 Deep Learning|Deep Learning 1]] Deep learning intro, Deep learning and GPUs, Expressiveness of MLPs, Exploding and vanishing gradients, Modern deep learning models
+
* Tues Oct 25, 2016 [[Class meeting for 10-605 Deep Learning|Deep Learning 1]]Deep learning intro, Deep learning and GPUs, Expressiveness of MLPs, Exploding and vanishing gradients, Modern deep learning models
* Thurs Oct 27, 2016 [[Class meeting for 10-605 Deep Learning|Deep Learning 2]] Reverse-mode differentiation, Recursive ANNs, Word2vec
+
* Thurs Oct 27, 2016 [[Class meeting for 10-605 Deep Learning|Deep Learning 2]]Reverse-mode differentiation, Recursive ANNs, Word2vec
* Tues Nov 1, 2016 [[Class meeting for 10-605 Randomized Algorithms|Randomized Algorithms 1]] Bloom filters, The countmin sketch
+
* Tues Nov 1, 2016 [[Class meeting for 10-605 Randomized Algorithms|Randomized Algorithms 1]]Bloom filters, The countmin sketch
** '''Start work on''' assignment 5: autodiff with IPM
+
** '''Start work on''' Assignment 5: Autodiff with IPM.  This is a new assignment for Fall 2016.
* Thurs Nov 3, 2016 [[Class meeting for 10-605 Randomized Algorithms|Randomized Algorithms 2]] Locality sensitive hashing
+
* Thurs Nov 3, 2016 [[Class meeting for 10-605 Randomized Algorithms|Randomized Algorithms 2]]Locality sensitive hashing
* Tues Nov 8, 2016 [[Class meeting for 10-605 Graph Architectures for ML|Graph Architectures for ML]] Graph-based ML architectures, Pregel, Signal-collect, GraphLab, PowerGraph, GraphChi, GraphX
+
* Tues Nov 8, 2016 [[Class meeting for 10-605 Graph Architectures for ML|Graph Architectures for ML]]Graph-based ML architectures, Pregel, Signal-collect, GraphLab, PowerGraph, GraphChi, GraphX
* Thurs Nov 10, 2016 [[Class meeting for 10-605 SSL on Graphs|SSL on Graphs]] Semi-supervised learning intro, Multirank-walk SSL method, Harmonic fields, Modified Adsorption SSL method, MAD with countmin sketches
+
* Thurs Nov 10, 2016 [[Class meeting for 10-605 SSL on Graphs|SSL on Graphs]]Semi-supervised learning intro, Multirank-walk SSL method, Harmonic fields, Modified Adsorption SSL method, MAD with countmin sketches
* Tues Nov 15, 2016 [[Class meeting for 10-605 Unsupervised Learning On Graphs|Unsupervised Learning On Graphs]] Spectral clustering, Power iteration clustering, Label propagation for clustering non-graph data, Label propagation for SSL on non-graph data
+
* Tues Nov 15, 2016 [[Class meeting for 10-605 Unsupervised Learning On Graphs|Unsupervised Learning On Graphs]]Spectral clustering, Power iteration clustering, Label propagation for clustering non-graph data, Label propagation for SSL on non-graph data
** '''Start work on''' assignment 6: graphX for SSL
+
** '''Start work on''' Assignment 6: To be decided, possibly using Spark/GraphX to do PIC or MRW.
* Thurs Nov 17, 2016 [[Class meeting for 10-605 Parameter Servers|Parameter Servers]]  
+
* Thurs Nov 17, 2016 [[Class meeting for 10-605 Parameter Servers|Parameter Servers]]
* Tues Nov 22, 2016 [[Class meeting for 10-605 LDA|LDA 1]] DGMs for naive Bayes, Gibbs sampling for LDA
+
* Tues Nov 22, 2016 [[Class meeting for 10-605 LDA|LDA 1]]DGMs for naive Bayes, Gibbs sampling for LDA
** '''Start work on''' assignment 7: LDA with parameter servers
+
** '''Start work on''' Assignment 7: LDA with a Parameter Server, draft http://curtis.ml.cmu.edu/w/courses/images/1/16/Hw7-lda-ps.pdf
* Tues Nov 29, 2016 [[Class meeting for 10-605 LDA|LDA 2]] Parallelizing LDA, Fast sampling for LDA, DGMs for graphs
+
* Tues Nov 29, 2016 [[Class meeting for 10-605 LDA|LDA 2]]Parallelizing LDA, Fast sampling for LDA, DGMs for graphs
* Thurs Dec 1, 2016 [[Class meeting for 10-605 Scalable Probabilistic Logics|Scalable Probabilistic Logics]]  
+
* Thurs Dec 1, 2016 [[Class meeting for 10-605 Scalable Probabilistic Logics|Scalable Probabilistic Logics]]
* Tues Dec 6, 2016 [[Class meeting for 10-605 Review session for final|Review session for final]]  
+
* Tues Dec 6, 2016 [[Class meeting for 10-605 Review session for final|Review session for final]]
 
** '''Last assignment due'''
 
** '''Last assignment due'''
* Thurs Dec 8, 2016 [[Class meeting for 10-605 Final Exam|Final Exam]]
+
* Thurs Dec 8, 2016 [[Class meeting for 10-605 Final Exam|Final Exam]].

Revision as of 17:29, 11 August 2016

This is the syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016.

Notes:

  • Homeworks, unless otherwise posted, will be due when the next HW comes out.
  • Lecture notes and/or slides will be (re)posted around the time of the lectures.
  • No classes will be held on Oct 4 (Rosh Hashana) or Nov 24 (Thanksgiving).

Schedule: