Class meeting for 10-605 SGD and Hash Kernels

From Cohen Courses
Jump to navigationJump to search

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2017.


Stochastic gradient descent:


Readings for the Class

Optional readings

Things to Remember

  • Approach of learning by optimization
  • Optimization goal for logistic regression
  • Key terms: logistic function, sigmoid function, log conditional likelihood, loss function, stochastic gradient descent
  • Updates for logistic regression, with and without regularization
  • Formalization of logistic regression as matching expectations between data and model
  • Regularization and how it interacts with overfitting
  • How "sparsifying" regularization affects run-time and memory
  • What the "hash trick" is and why it should work