Class meeting for 10-605 SGD and Hash Kernels

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2015.

Slides

Stochastic gradient descent:

Today's Quiz

https://qna-app.appspot.com/view.html?aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgICg7MHcCgw

Readings for the Class

Optional readings

* For logistic regression, and the sparse updates for it: [http://lingpipe.files.wordpress.com/2008/04/lazysgdregression.pdf Lazy Sparse Stochastic Gradient Descent for Regularized Multinomial Logistic Regression], Carpenter, Bob. 2008. See also [http://alias-i.com/lingpipe/demos/tutorial/logistic-regression/read-me.html his blog post] on logistic regression. I also recommend [http://www.cs.cmu.edu/~wcohen/10-605/notes/elkan-logreg.pdf Charles Elkan's notes on logistic regression] (local saved copy). A rough code sketch of the lazy update appears after this list.

* For hash kernels: [http://arxiv.org/pdf/0902.2206.pdf Feature Hashing for Large Scale Multitask Learning], Weinberger et al., ICML 2009.
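
The lazy-update idea in the Carpenter paper is worth seeing in code. Below is a minimal sketch, not the paper's exact algorithm: plain SGD for L2-regularized binary logistic regression, where the function name, the toy data format, the learning rate eta, and the penalty strength mu are all made up for this example. The point to notice is that both the gradient step and the (lazily applied) regularization shrinkage only touch the features that are active in the current example, which is what keeps the per-example cost proportional to the number of non-zero features.

<pre>
import math
from collections import defaultdict

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lazy_sgd_logreg(examples, epochs=5, eta=0.1, mu=0.01):
    """SGD for L2-regularized binary logistic regression with lazy
    regularization.  Names, defaults, and data format are illustrative."""
    w = defaultdict(float)            # weight for each feature seen so far
    last_touched = defaultdict(int)   # step at which each weight was last regularized
    t = 0
    for _ in range(epochs):
        for feats, y in examples:     # feats: {feature: value}, y in {0, 1} (toy format)
            t += 1
            # Lazily apply the L2 shrinkage this weight has "missed" since it
            # was last active: one factor of (1 - eta*mu) per skipped step.
            for j in feats:
                w[j] *= (1.0 - eta * mu) ** (t - last_touched[j])
                last_touched[j] = t
            # Usual sparse gradient step, w_j += eta * (y - p) * x_j,
            # applied only to the features active in this example.
            p = sigmoid(sum(w[j] * x for j, x in feats.items()))
            for j, x in feats.items():
                w[j] += eta * (y - p) * x
    return w

# Tiny made-up usage example:
data = [({"good": 1.0, "movie": 1.0}, 1), ({"awful": 1.0, "movie": 1.0}, 0)]
weights = lazy_sgd_logreg(data)
</pre>

If the regularizer were applied eagerly, every weight would have to be touched on every example; deferring the shrinkage until a feature reappears is what makes regularized SGD practical on high-dimensional sparse data.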

Things to Remember

  • Approach of learning by optimization
  • Optimization goal for logistic regression
  • Key terms: logistic function, sigmoid function, log conditional likelihood, loss function, stochastic gradient descent
  • Updates for logistic regression, with and without regularization
  • Formalization of logistic regression as matching expectations between data and model
  • Regularization and how it interacts with overfitting
  • How "sparsifying" regularization affects run-time and memory
  • What the "hash trick" is and why it should work
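
For the last item above, here is a minimal sketch of the hash trick, loosely in the spirit of the Weinberger et al. paper listed under the readings: feature names are hashed straight into one of B buckets instead of being looked up in a dictionary, and a second hash chooses a sign so that colliding features tend to cancel rather than bias the dot product. The helper names, the 2^20 bucket count, and the use of Python's hashlib are assumptions of this sketch, not anything the paper fixes.

<pre>
import hashlib

def _bucket(s, salt, mod):
    # Deterministic string hash into [0, mod); hashlib is used only so the
    # sketch gives the same buckets on every run -- any decent hash would do.
    return int(hashlib.md5((salt + s).encode("utf-8")).hexdigest(), 16) % mod

def hash_features(feats, num_buckets=2 ** 20):
    """Map {feature_name: value} into a fixed-size hashed representation,
    kept sparse here as {bucket_index: value}."""
    hashed = {}
    for name, value in feats.items():
        idx = _bucket(name, "index:", num_buckets)              # bucket the feature lands in
        sign = 1.0 if _bucket(name, "sign:", 2) == 0 else -1.0  # signed hashing, as in Weinberger et al.
        hashed[idx] = hashed.get(idx, 0.0) + sign * value
    return hashed

# The learner only ever sees bucket indices, so it can keep one weight per
# bucket (a fixed-length array) instead of a growing feature dictionary:
x = hash_features({"good": 1.0, "movie": 1.0})
</pre>

Memory is then bounded by B no matter how large the vocabulary grows; collisions add a small amount of noise to the dot products, which is what the analysis in the paper quantifies.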