10-601 Wrap-up on Linear Classification
This is a lecture used in the Syllabus for Machine Learning 10-601
To summarize this part of the course, we looked at four algorithms in some detail.
We derived naive Bayes by starting with strong independence assumptions, which led to an optimization problem that you can solve in closed form. This gives us a super-fast method that runs in one pass through the data.
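To make the "one pass" point concrete, here is a minimal sketch of a multinomial naive Bayes text classifier. The function names, the bag-of-words input format, and the add-alpha smoothing constant are illustrative assumptions, not the exact formulation from lecture. Training only accumulates counts, which is why the closed-form solution needs a single pass.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(examples):
    """Single pass over (label, tokens) pairs; only counts are stored."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in examples:
        label_counts[label] += 1
        for tok in tokens:
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict_naive_bayes(model, tokens, alpha=1.0):
    """Return the label with the highest smoothed log joint probability."""
    label_counts, word_counts, vocab = model
    n_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        total_words = sum(word_counts[label].values())
        # log prior plus a sum of per-word log likelihoods
        # (the independence assumption turns the product into a sum)
        score = math.log(count / n_examples)
        for tok in tokens:
            numer = word_counts[label][tok] + alpha
            denom = total_words + alpha * len(vocab)
            score += math.log(numer / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, made up for illustration only.
toy_examples = [
    ("spam", ["buy", "cheap", "pills"]),
    ("spam", ["cheap", "offer"]),
    ("ham", ["meeting", "at", "noon"]),
    ("ham", ["lunch", "at", "noon"]),
]
toy_model = train_naive_bayes(toy_examples)
print(predict_naive_bayes(toy_model, ["cheap", "pills"]))
```

Note that the score is itself a linear function of the word counts in the test document, which is one way to see that naive Bayes is a linear classifier.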
We derived logistic regression by starting with a functional form for a classifier - a linear classifier - and then constructing a reasonable-seeming optimization criterion. This gives us another optimization problem, which can't be solved in closed form, but can be solved with off-the-shelf optimization techniques.
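As a rough illustration of what "no closed form, but solvable by standard optimization" means here, the sketch below fits logistic regression by plain batch gradient ascent on the conditional log-likelihood. The learning rate, epoch count, function names, and toy data are assumptions chosen for the example; an off-the-shelf optimizer would normally replace the hand-written loop.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Batch gradient ascent on the average conditional log-likelihood.

    X is a list of feature vectors, y a list of labels in {0, 1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = yi - p  # gradient of the log-likelihood w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated P(y = 1 | x) under the fitted linear model."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy one-dimensional data, made up for illustration only.
X_toy = [[-2.0], [-1.0], [1.0], [2.0]]
y_toy = [0, 0, 1, 1]
w_fit, b_fit = train_logistic_regression(X_toy, y_toy)
print(predict_proba(w_fit, b_fit, [1.5]) > 0.5)
```

The decision rule depends on x only through the linear score w·x + b, which is the functional form the derivation started from.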
We didn't derive the perceptron algorithm - we started with a simple update rule, and then analyzed how it performs in a particular formal model. However, we noted a connection between the conditions for success for the perceptron and another optimization criterion, which leads to support vector machines. We also talked about the "kernelized" version of the perceptron.
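The update rule can be sketched as follows, assuming labels in {-1, +1}, no bias term, and a fixed cap on the number of passes (all of these are simplifying assumptions for the example, and the function name is illustrative). The weight vector changes only when an example is misclassified.

```python
def train_perceptron(X, y, epochs=10):
    """Mistake-driven perceptron; returns the weights and the mistake count."""
    w = [0.0] * len(X[0])
    mistakes = 0
    for _ in range(epochs):
        made_mistake = False
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi))
            if yi * score <= 0:  # wrong side of (or exactly on) the boundary
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                mistakes += 1
                made_mistake = True
        if not made_mistake:  # a full clean pass: the data is separated
            break
    return w, mistakes


# Toy linearly separable data, made up for illustration only.
X_toy = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
y_toy = [1, 1, -1, -1]
print(train_perceptron(X_toy, y_toy))  # ([1.0, 2.0], 1)
```

On this toy data a single update is enough. The mistake-bound analysis says that, for separable data, the total number of updates is bounded in terms of the margin and the radius of the data, independent of the order of the examples.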
- To be added. One thing to discuss: the relation between the voted perceptron and SGD.
What You Should Know Afterward
- What a linear classifier is, and why all the methods discussed here are linear classifiers.
- What a margin-based classifier is.
- What optimization criteria are associated with naive Bayes, logistic regression, and SVMs.
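As a compact reminder, one common way to write the three objectives is given below. The notation is an assumption made for this summary and may differ from the lecture slides: training pairs are $(x_i, y_i)$, $\theta$ denotes the naive Bayes parameters, $w$ is a weight vector, $\sigma$ is the logistic function, $C$ is a regularization constant, and the SVM line uses labels $y_i \in \{-1, +1\}$ with the soft-margin form.

```latex
\begin{align*}
\text{Naive Bayes:} \quad
  & \max_{\theta} \sum_{i} \log P(x_i, y_i \mid \theta) \\
\text{Logistic regression:} \quad
  & \max_{w} \sum_{i} \log P(y_i \mid x_i, w),
    \qquad P(y = 1 \mid x, w) = \sigma(w \cdot x) \\
\text{SVM (soft margin):} \quad
  & \min_{w} \; \tfrac{1}{2} \lVert w \rVert^2
    + C \sum_{i} \max\bigl(0,\; 1 - y_i \, (w \cdot x_i)\bigr)
\end{align*}
```

Naive Bayes maximizes a joint likelihood, logistic regression a conditional likelihood, and the SVM trades off a large margin against hinge-loss violations. In all three cases the resulting decision rule thresholds a linear function of $x$.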