Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2012

From Cohen Courses
Revision as of 16:58, 20 October 2011

Schedule

  • Overviews [1 week]
    • Lecture: Overview of course, cost of various operations, asymptotic analysis
    • Lecture: Review of probabilities
  • Streaming Learning algorithms [1.5 weeks]
    • Lecture: Naive Bayes, and a streaming implementation of it (features in memory).
      • Assignment: streaming Naive Bayes w/ features in memory
    • Lecture: Naive Bayes and logistic regression.
    • Lecture: SGD implementation of LogReg, with lazy regularization
      • Assignment: streaming LogReg w/ features in memory
  • Stream-and-sort [1.5 weeks]
    • Lecture: Naive Bayes when the data doesn't fit in memory.
    • Lecture: finding informative phrases (with vocab counts in memory).
      • Assignment: stream-and-sort Naive Bayes
    • Lecture: messages and records; revisit finding informative phrases.
      • Assignment: finding informative phrases
  • Map-reduce and Hadoop [1 week]
    • Lecture: Alona, using Hadoop
    • Lecture: Alona, programming tips
  • Reducing memory usage with randomized methods.
    • Lecture: Locality-sensitive hashing.
    • Lecture: Bloom filters for counting events.
    • Lecture: Feature hashing and Vowpal Wabbit.

Planned Topics

Draft - subject to change!

  • Week 1. Overview of course, and overview lecture on probabilities.
  • Week 2. Streaming learning algorithms.
    • Naive Bayes for discrete data.
    • A streaming-data implementation of Naive Bayes.
    • A streaming-data implementation of Naive Bayes assuming a larger-than-memory feature set, by using the 'stream and sort' pattern.
    • Discussion of other streaming learning methods.
      • Rocchio
      • Perceptron-style algorithms
      • Streaming regression?
    • Assignment: two implementations of Naive Bayes, one with feature-weights in memory, one purely streaming.
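The first of these two implementations might be sketched roughly as follows: a one-pass multinomial Naive Bayes that keeps all counts in memory (class and method names here are illustrative, not part of the assignment):

```python
from collections import defaultdict
import math

class StreamingNB:
    """Multinomial Naive Bayes trained in one pass, all counts in memory."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                                           # Laplace smoothing
        self.word_counts = defaultdict(lambda: defaultdict(float))   # label -> word -> count
        self.label_totals = defaultdict(float)                       # label -> total word count
        self.label_docs = defaultdict(float)                         # label -> document count
        self.vocab = set()
        self.n_docs = 0

    def update(self, label, words):
        """Consume one (label, token list) example from the stream."""
        self.n_docs += 1
        self.label_docs[label] += 1
        for w in words:
            self.word_counts[label][w] += 1
            self.label_totals[label] += 1
            self.vocab.add(w)

    def predict(self, words):
        """Return the label maximizing the smoothed log-posterior."""
        V = len(self.vocab)
        best, best_lp = None, float("-inf")
        for y in self.label_docs:
            lp = math.log(self.label_docs[y] / self.n_docs)
            for w in words:
                num = self.word_counts[y].get(w, 0.0) + self.alpha
                den = self.label_totals[y] + self.alpha * V
                lp += math.log(num / den)
            if lp > best_lp:
                best, best_lp = y, lp
        return best
```

The purely streaming variant replaces the in-memory count dictionaries with the stream-and-sort pattern covered in Week 2.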
  • Week 3. Examples of more complex programs using stream-and-sort.
    • Lecture topics:
      • Finding informative phrases in a corpus, and finding polar phrases in a corpus.
      • Using records and messages to manage a complex dataflow.
    • Assignment: phrase-finding and sentiment classification
  • Week 4. The map-reduce paradigm and Hadoop.
    • Assignment: Hadoop re-implementation of assignments 1/2.
  • Week 5. Reducing memory usage with randomized methods.
    • Feature hashing and Vowpal Wabbit.
    • Bloom filters for counting events.
    • Locality-sensitive hashing.
    • Assignment: memory-efficient Naive Bayes.
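The feature-hashing idea behind Vowpal Wabbit can be sketched roughly as below: an unbounded vocabulary is folded into a fixed-size count vector, so memory no longer grows with the number of distinct features (the bucket count, hash function, and sign trick here are illustrative choices, not a specific tool's implementation):

```python
import hashlib

def hashed_features(tokens, n_buckets=2**10):
    """Hashing trick: map tokens into a fixed-size vector of signed counts."""
    vec = [0.0] * n_buckets
    for t in tokens:
        h = int(hashlib.md5(t.encode()).hexdigest(), 16)
        idx = h % n_buckets                           # which bucket the token lands in
        sign = 1.0 if (h >> 64) & 1 == 0 else -1.0    # signed hashing reduces collision bias
        vec[idx] += sign
    return vec
```

Collisions introduce noise, but with enough buckets the effect on a linear classifier is typically small, which is what makes the trick usable for the memory-efficient Naive Bayes assignment above.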
  • Week 6-7. Nearest-neighbor finding and bulk classification.
    • Using a search engine to find approximate nearest neighbors.
    • Using inverted indices to find approximate nearest neighbors or to perform bulk linear classification.
    • Implementing soft joins using map-reduce and nearest-neighbor methods.
    • The local k-NN graph for a dataset.
    • Assignment: Tool for approximate k-NN graph for a large dataset.
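The inverted-index approach to approximate nearest neighbors can be sketched as follows: only documents sharing at least one term with the query are ever scored, which is the same "soft join" idea used for bulk classification (the cosine scoring and data layout here are illustrative assumptions):

```python
from collections import defaultdict
import math

def build_index(docs):
    """Build an inverted index (term -> [(doc_id, tf)]) plus per-doc norms."""
    index = defaultdict(list)
    norms = {}
    for doc_id, tokens in docs.items():
        tf = defaultdict(int)
        for t in tokens:
            tf[t] += 1
        norms[doc_id] = math.sqrt(sum(c * c for c in tf.values()))
        for t, c in tf.items():
            index[t].append((doc_id, c))
    return index, norms

def approx_neighbors(query_tokens, index, norms, k=3):
    """Score only docs that share a term with the query; rank by cosine."""
    qtf = defaultdict(int)
    for t in query_tokens:
        qtf[t] += 1
    qnorm = math.sqrt(sum(c * c for c in qtf.values()))
    scores = defaultdict(float)
    for t, qc in qtf.items():
        for doc_id, dc in index.get(t, []):
            scores[doc_id] += qc * dc                 # accumulate dot product
    ranked = sorted(((s / (qnorm * norms[d]), d) for d, s in scores.items()),
                    reverse=True)
    return ranked[:k]
```

Building the k-NN graph for a whole dataset amounts to running the query step once per document, which parallelizes naturally under map-reduce.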
  • Week 8-10. Working with large graphs.
    • PageRank and RWR/PPR.
    • Special issues involved with iterative processing on graphs in Map-Reduce: the schimmy pattern.
      • Formalisms/environments for iterative processing on graphs: GraphLab, Spark, Pregel.
    • Extracting small graphs from a large one:
      • LocalSpectral - finding the meaningful neighborhood of a query node in a large graph.
      • Visualizing graphs.
    • Semi-supervised classification on graphs.
    • Clustering and community-finding in graphs.
    • Assignment: snowball sampling a graph with LocalSpectral and visualizing the results.
  • Week 11. Stochastic gradient descent and other streaming learning algorithms.
    • SGD for logistic regression.
    • SGD with large feature sets: delayed regularization updates; projection onto the L1 ball; truncated gradients.
    • Assignment: Proposal for a one-month project.
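The delayed-regularization idea for SGD can be sketched as follows: instead of shrinking every weight at every step, the accumulated L2 shrink is applied to a weight only when its feature is next touched, so each update costs time proportional to the active features of one example (the function name, learning-rate schedule, and data format are illustrative assumptions):

```python
import math
from collections import defaultdict

def sgd_logreg_lazy(stream, lr=0.1, lam=0.01):
    """SGD for L2-regularized logistic regression with lazy regularization.

    stream yields (y, x) pairs with y in {0, 1} and x a sparse dict
    mapping feature -> value. The shrink w_j *= (1 - lr*lam) is applied
    just-in-time, only when feature j appears in the current example.
    """
    w = defaultdict(float)
    last_seen = defaultdict(int)   # step at which each weight was last updated
    t = 0
    for y, x in stream:
        t += 1
        # Catch up on the shrinks this example's features have missed.
        for j in x:
            w[j] *= (1 - lr * lam) ** (t - 1 - last_seen[j])
        z = sum(w[j] * v for j, v in x.items())
        p = 1.0 / (1.0 + math.exp(-z))
        for j, v in x.items():
            w[j] = w[j] * (1 - lr * lam) + lr * (y - p) * v
            last_seen[j] = t
    # Final catch-up so every weight reflects the full regularization.
    for j in w:
        w[j] *= (1 - lr * lam) ** (t - last_seen[j])
    return dict(w)
```

Truncated gradients and L1 projection replace the multiplicative shrink with a clipping step, but the lazy bookkeeping is the same.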
  • Weeks 12-15. Additional topics.
    • Scalable k-means clustering.
    • Gibbs sampling and streaming LDA.
    • Stacking and cascaded learning approaches.
    • Decision tree learning for large datasets.
    • Assignment: Writeup of project results.