Class meeting for 10-605 SGD and Hash Kernels
From Cohen Courses
Revision as of 15:36, 1 October 2015

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2015.

Slides
Stochastic gradient descent:
- Slides in Powerpoint: http://www.cs.cmu.edu/~wcohen/10-605/sgd.pptx
- Slides in PDF: http://www.cs.cmu.edu/~wcohen/10-605/sgd.pdf
Today's Quiz
https://qna-app.appspot.com/view.html?aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgICg7MHcCgw
Readings for the Class
Optional readings
- For logistic regression and its sparse updates: Lazy Sparse Stochastic Gradient Descent for Regularized Multinomial Logistic Regression, Bob Carpenter, 2008. See also his blog post on logistic regression. I also recommend Charles Elkan's notes on logistic regression (local saved copy).
- For hash kernels: Feature Hashing for Large Scale Multitask Learning, Weinberger et al., ICML 2009.
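The lazy-update idea in the Carpenter reading can be sketched roughly as follows: instead of shrinking every weight on every step, L2 regularization for a weight is deferred until its feature next appears, using per-weight timestamps. This is a minimal illustration under that idea, not the paper's implementation; the function name and hyperparameter defaults are invented for the sketch.

```python
import math

def lazy_sgd_logreg(examples, n_features, lr=0.1, l2=0.01, epochs=1):
    """Lazy sparse SGD for L2-regularized binary logistic regression.

    examples: list of (x, y) with x a sparse dict {feature_index: value}
    and y in {0, 1}. Regularization is applied lazily, only to the
    features active in the current example.
    """
    w = [0.0] * n_features
    last = [0] * n_features   # step at which each weight was last regularized
    t = 0
    for _ in range(epochs):
        for x, y in examples:
            t += 1
            # Catch up on deferred L2 shrinkage for active features only:
            # one multiplicative shrink per skipped step.
            for j in x:
                w[j] *= (1.0 - lr * l2) ** (t - last[j])
                last[j] = t
            margin = sum(w[j] * v for j, v in x.items())
            p = 1.0 / (1.0 + math.exp(-margin))
            g = y - p             # gradient of the log-likelihood w.r.t. margin
            for j, v in x.items():
                w[j] += lr * g * v
    # Final catch-up so every weight reflects the full regularization.
    for j in range(n_features):
        w[j] *= (1.0 - lr * l2) ** (t - last[j])
    return w
```

The point of the timestamps is that each update touches only the nonzero features of the current example, so the cost per example is proportional to its sparsity rather than to the total feature count.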
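The hash-kernel reading's core trick can also be sketched briefly: map string features into a fixed-size vector by hashing each feature to a bucket, with a second, independent sign hash so that hash collisions tend to cancel in inner products. This is a toy sketch of that idea, not the paper's code; the function name, the use of MD5, and the bucket count are choices made here for illustration.

```python
import hashlib

def hash_features(tokens, n_buckets=2 ** 20):
    """Hash a list of string features into a sparse signed-count vector.

    Returns a dict {bucket_index: signed_count}. The index comes from
    one part of the hash, the +/-1 sign from another, so colliding
    features can cancel rather than always adding up.
    """
    v = {}
    for tok in tokens:
        h = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(h[:4], "little") % n_buckets   # bucket hash
        sign = 1 if h[4] & 1 else -1                        # sign hash
        v[idx] = v.get(idx, 0) + sign
    return v
```

Because the mapping is a fixed function of the token, no feature dictionary has to be stored or synchronized, which is what makes the trick attractive for large-scale and multitask learning.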