Class meeting for 10-605 SGD and Hash Kernels

This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2017|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall 2017]].
  
 
=== Slides ===

Stochastic gradient descent:

* [http://www.cs.cmu.edu/~wcohen/10-605/2016/sgd.pptx Slides in Powerpoint]
* [http://www.cs.cmu.edu/~wcohen/10-605/2016/sgd.pdf Slides in PDF]
  
 
=== Quiz ===

=== Readings for the Class ===

=== Optional readings ===

=== Things to Remember ===

* Approach of learning by optimization
* Optimization goal for logistic regression
* Key terms: logistic function, sigmoid function, log conditional likelihood, loss function, stochastic gradient descent
* Updates for logistic regression, with and without regularization (see the sketch after this list)
* Formalization of logistic regression as matching expectations between data and model
* Regularization and how it interacts with overfitting
* How "sparsifying" regularization affects run-time and memory
* What the "hash trick" is and why it should work (also illustrated in the sketch below)
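
To make the update rule and the hash trick concrete, here is a minimal Python sketch. It is not code from the course: the function names (<code>hash_features</code>, <code>sgd_logistic</code>) and the constants (bucket count, learning rate <code>eta</code>, regularization strength <code>lam</code>) are illustrative assumptions. It combines the SGD update for L2-regularized logistic regression with hash-trick feature vectors:

<syntaxhighlight lang="python">
import math
import random

def hash_features(tokens, num_buckets=2**20):
    """Hash trick (illustrative): map each token to one of num_buckets indices,
    so the weight vector has a fixed size regardless of vocabulary growth."""
    x = {}
    for tok in tokens:
        j = hash(tok) % num_buckets  # built-in hash as a stand-in for a real hash function
        x[j] = x.get(j, 0.0) + 1.0   # colliding tokens simply pool their counts
    return x

def sigmoid(z):
    z = max(min(z, 30.0), -30.0)  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def sgd_logistic(examples, num_buckets=2**20, eta=0.1, lam=1e-4, epochs=5):
    """SGD for L2-regularized logistic regression.
    examples: list of (tokens, label) pairs with label in {0, 1}."""
    w = [0.0] * num_buckets
    for _ in range(epochs):
        random.shuffle(examples)
        for tokens, y in examples:
            x = hash_features(tokens, num_buckets)
            p = sigmoid(sum(w[j] * v for j, v in x.items()))  # model's P(y = 1 | x)
            for j, v in x.items():
                # Ascend the gradient of the log conditional likelihood, (y - p) * v,
                # shrinking only the weights this example actually touches.
                w[j] += eta * ((y - p) * v - lam * w[j])
    return w
</syntaxhighlight>

Note that the L2 shrinkage here is applied only to the weights of features present in the current example; this sparse approximation, in the spirit of the "sparsifying" regularization point above, keeps each update proportional to the number of non-zero features rather than to the full parameter vector.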