10-601 Matrix Factorization
This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016
Slides
Readings
- Murphy Chap 12. PCA is not covered in Mitchell.
- There are also some notes on PCA/SVD that I've written up.
- There's a nice description of the gradient-based approach to MF, and a scheme for parallelizing it, by Gemulla et al. A minimal sketch of the underlying SGD update appears below.
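The snippet below is a minimal sketch of the sequential SGD update that the Gemulla et al. parallelization scheme builds on, applied to completing a partially observed ratings matrix. This is an illustration, not their algorithm: the function name sgd_mf and all hyperparameter values (rank, lr, reg, epochs) are assumptions made for the example.

```python
import numpy as np

def sgd_mf(ratings, n_rows, n_cols, rank=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Plain SGD for MF: fit R ~= U @ V.T from observed (i, j, r) triples.
    Illustrative sketch; names and hyperparameters are assumptions."""
    rng = np.random.default_rng(seed)
    ratings = list(ratings)                        # copy so we can shuffle freely
    U = 0.1 * rng.standard_normal((n_rows, rank))  # row (e.g. user) factors
    V = 0.1 * rng.standard_normal((n_cols, rank))  # column (e.g. item) factors
    for _ in range(epochs):
        rng.shuffle(ratings)
        for i, j, r in ratings:
            err = r - U[i] @ V[j]                  # residual on one observed cell
            ui = U[i].copy()                       # keep old value for V's update
            # gradient steps on the L2-regularized squared loss for this cell
            U[i] += lr * (err * V[j] - reg * U[i])
            V[j] += lr * (err * ui - reg * V[j])
    return U, V

# Toy usage: a 3x3 ratings matrix with five observed entries.
obs = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 0, 1.0), (2, 2, 2.0)]
U, V = sgd_mf(obs, n_rows=3, n_cols=3)
print(np.round(U @ V.T, 2))  # filled-in (completed) ratings matrix
```

Each update touches only one row of U and one row of V, which is the property the parallelization scheme in the paper exploits.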
Summary
You should know:
- What PCA is, and how it relates to matrix factorization.
- How to interpret the "cartoons" that we use to illustrate PCA.
- What loss function and constraints are associated with PCA - i.e., what the "PCA Problem" is (a standard formulation is written out after this list).
- How the principal components are related to each other and to the data (both properties are checked in the sketch after this list):
  - The earlier components carry the most variance: when the examples are re-expressed in the basis defined by the components, their variance along the first component is largest, then the second, and so on.
  - The components are orthogonal to each other (by construction).
- How to interpret the low-dimensional embedding of instances, and the "prototypes" produced by PCA and MF techniques.
  - How to interpret the prototypes in the case of dimension reduction for images.
  - How to interpret the prototypes in the case of collaborative filtering, and completion of a ratings matrix.
- How PCA and MF relate to k-means and EM (the k-means connection is sketched after this list).
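For the bullet about the "PCA Problem": one standard way to write the loss and constraints (a common reconstruction-error formulation; the slides' notation may differ) is

```latex
\min_{W \in \mathbb{R}^{d \times k}} \; \sum_{i=1}^{n} \big\| x_i - W W^{\top} x_i \big\|_2^2
\qquad \text{subject to} \qquad W^{\top} W = I_k ,
```

where the columns of W are the first k principal components. Equivalently, one can maximize the variance of the projected data under the same orthonormality constraint.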
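The two properties of the components can be checked directly with SVD-based PCA. The snippet below is a sketch on synthetic data; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # correlated data
Xc = X - X.mean(axis=0)                 # center the data before PCA

# PCA via SVD: the rows of Vt are the principal components
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

Z = Xc @ Vt.T                           # re-express examples in the new basis
print(Z.var(axis=0))                    # variances come out in decreasing order
print(np.round(Vt @ Vt.T, 8))           # components are orthonormal: identity
```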
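For the k-means connection in the last bullet: k-means can be read as a constrained matrix factorization (a standard identity, not specific to these slides),

```latex
\min_{Z, M} \; \| X - Z M \|_F^2
\qquad \text{s.t. each row of } Z \text{ is a one-hot cluster indicator,}
```

so the rows of M (the cluster centers) play the same "prototype" role as the factors in PCA/MF, and relaxing the one-hot constraint recovers an MF-style problem.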