Difference between revisions of "10-601 Logistic Regression"
From Cohen Courses
Revision as of 09:48, 16 September 2014
This is a lecture used in the Syllabus for Machine Learning 10-601 in Fall 2014.
Slides
Readings
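- Optional:
  - Bishop 4.2-4.3
  - William's notes on SGD (for 10605): http://www.cs.cmu.edu/~wcohen/10-605/notes/sgd-notes.pdf
  - Charles Elkan's notes on SGD: http://cseweb.ucsd.edu/~elkan/250B/logreg.pdf
  - Lazy Sparse Stochastic Gradient Descent for Regularized Multinomial Logistic Regression, Carpenter, Bob. 2008 (http://lingpipe.files.wordpress.com/2008/04/lazysgdregression.pdf). See also his blog post on logistic regression (http://alias-i.com/lingpipe/demos/tutorial/logistic-regression/read-me.html).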
What You Should Know Afterward
- How to implement logistic regression.
- How to determine the best parameters for logistic regression models.
- Why regularization matters for logistic regression.
- How logistic regression and naive Bayes are similar and different.
- The difference between a discriminative and a generative classifier.
- What "overfitting" is, and why optimizing performance on a training set does not necessarily lead to good performance on a test set.
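As a rough illustration of the first three points, here is a minimal sketch of binary logistic regression trained by stochastic gradient descent with an L2 penalty. It is not taken from the lecture slides: the function names (`sigmoid`, `predict_proba`, `train_sgd`), the hyperparameter defaults, and the toy data are all assumptions made for this example, and it uses only the Python standard library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """P(y = 1 | x) for weights w, bias b, and feature list x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_sgd(X, y, lr=0.1, l2=0.01, epochs=200, seed=0):
    """Fit (w, b) by SGD on the L2-regularized log loss.

    lr, l2, epochs, and seed are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit examples in a fresh random order each epoch
        for i in order:
            # Derivative of the log loss with respect to the linear score.
            err = predict_proba(w, b, X[i]) - y[i]
            # Per-example step: loss gradient plus L2 shrinkage on the weights.
            w = [wj - lr * (err * xj + l2 * wj) for wj, xj in zip(w, X[i])]
            b -= lr * err  # the bias is left unregularized here (a common choice)
    return w, b


# Toy, made-up data: the label is 1 when the single feature is positive.
X_train = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y_train = [0, 0, 0, 1, 1, 1]

w, b = train_sgd(X_train, y_train)
w_strong, b_strong = train_sgd(X_train, y_train, l2=1.0)  # heavier penalty
```

On separable data like this toy set, the unregularized maximum-likelihood weights grow without bound, which is one reason regularization matters for logistic regression. Raising `l2` pulls the learned weight toward zero and makes the predicted probabilities less extreme, which is the usual lever against overfitting; a suitable value would normally be chosen on held-out data rather than the training set.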