Class meeting for 10-405 SGD and Hash Kernels

From Cohen Courses

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-405 in Spring 2018.

Slides

Stochastic gradient descent:

  • Slides in Powerpoint: http://www.cs.cmu.edu/~wcohen/10-405/sgd.pptx
  • Slides in PDF: http://www.cs.cmu.edu/~wcohen/10-405/sgd.pdf

Quiz

Readings for the Class

Optional readings

Things to Remember

  • Approach of learning by optimization
  • Optimization goal for logistic regression
  • Key terms: logistic function, sigmoid function, log conditional likelihood, loss function, stochastic gradient descent
  • Updates for logistic regression, with and without regularization (a worked form of these updates is sketched after this list)
  • The formal properties of sparse logistic regression (see the lazy-update sketch after this list)
    • Whether it is exact or approximate
    • How it changes memory and time usage
  • Formalization of logistic regression as matching expectations between data and model
  • Regularization and how it interacts with overfitting
  • How "sparsifying" regularization affects run-time and memory
  • What the "hash trick" is and why it should work (see the feature-hashing sketch after this list)
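
For the optimization items above, one standard way to write the objective and the per-example update follows. The notation (weights w, features x_i, labels y_i in {0,1}, learning rate λ, regularization strength μ) is mine, so read this as a sketch of the usual formulation rather than a transcription of the slides.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Logistic regression on labeled examples (x_i, y_i), y_i in {0,1},
% with weight vector w; p_i is the predicted probability of the positive class.
\begin{align*}
  p_i &= \sigma(\mathbf{w}\cdot\mathbf{x}_i) = \frac{1}{1 + e^{-\mathbf{w}\cdot\mathbf{x}_i}}
      && \text{logistic (sigmoid) function}\\
  \mathrm{LCL}(\mathbf{w}) &= \sum_i \bigl[\, y_i \log p_i + (1-y_i)\log(1-p_i) \,\bigr]
      && \text{log conditional likelihood, to be maximized}\\
  \mathbf{w} &\leftarrow \mathbf{w} + \lambda\,(y_i - p_i)\,\mathbf{x}_i
      && \text{SGD step on one example, learning rate } \lambda\\
  \mathbf{w} &\leftarrow \mathbf{w} + \lambda\bigl[(y_i - p_i)\,\mathbf{x}_i - 2\mu\,\mathbf{w}\bigr]
      && \text{same step with an L2 penalty } \mu\|\mathbf{w}\|^2
\end{align*}
\end{document}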
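
The "sparse logistic regression" items refer to updates that only touch weights of features active in the current example, deferring the regularization shrinkage until a weight is next used. Below is a minimal Python sketch of that lazy-update idea; all names (lazy_sgd_lr, the dict-based weight vector, the clock counter k) are illustrative, and the multiplicative catch-up shrinkage shown is one common approximation, not necessarily the exact scheme from the lecture.

import math

def lazy_sgd_lr(examples, n_epochs=5, lam=0.1, mu=0.01):
    """SGD for logistic regression with lazy ("sparse") L2 regularization.

    examples: list of (features, y) pairs, where features is a dict
    mapping feature id -> value and y is 0 or 1.  Instead of shrinking
    every weight on every example, each weight remembers when it was
    last regularized and catches up only when its feature reappears.
    """
    w = {}       # weights, stored sparsely
    last = {}    # clock value at which each weight was last regularized
    k = 0        # global example counter ("clock")

    for _ in range(n_epochs):
        for features, y in examples:
            k += 1
            # Apply the shrinkage this example's features have missed.
            for j in features:
                if j in w:
                    w[j] *= (1.0 - 2.0 * lam * mu) ** (k - last.get(j, 0))
                last[j] = k
            # Ordinary per-example logistic-regression step, touching
            # only the features active in this example.
            score = sum(w.get(j, 0.0) * v for j, v in features.items())
            score = max(-30.0, min(30.0, score))   # guard against overflow in exp
            p = 1.0 / (1.0 + math.exp(-score))
            for j, v in features.items():
                w[j] = w.get(j, 0.0) + lam * (y - p) * v
    return w

# Tiny usage example with two hypothetical one-feature documents:
# w = lazy_sgd_lr([({"good": 1.0}, 1), ({"bad": 1.0}, 0)])

The payoff is that each example costs time proportional to its number of nonzero features rather than to the total number of weights, while memory still grows with the number of distinct features seen.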
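
For the last item, here is a minimal sketch of the hash trick (feature hashing): feature names are mapped directly to indices in a fixed-size space by a hash function, so no feature dictionary needs to be stored. The function name and bucket count are illustrative; note that Python's built-in hash is randomized per process for strings, so a real system would use a stable hash.

def hashed_features(tokens, num_buckets=2 ** 20):
    """Map raw feature names to a fixed-size index space by hashing.

    Collisions are possible, but with enough buckets they are rare
    enough that the learner tolerates the extra noise, and memory no
    longer grows with the vocabulary.  Hash kernels additionally use a
    second hash to give each feature a +/-1 sign, so collisions cancel
    in expectation; that refinement is omitted here.
    """
    x = {}  # sparse hashed feature vector: bucket index -> count
    for tok in tokens:
        j = hash(tok) % num_buckets   # built-in hash is per-process for str;
                                      # use a stable hash (e.g. hashlib) in practice
        x[j] = x.get(j, 0.0) + 1.0
    return x

# Example: hashed_features("the hash trick maps the features".split())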