Difference between revisions of "10-601 Matrix Factorization"
From Cohen Courses
Latest revision as of 14:33, 21 April 2016
This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.
Slides
- Slides in PDF: http://www.cs.cmu.edu/~wcohen/10-601/pca+mf.pdf
Readings
- Murphy Chap 12. PCA is not covered in Mitchell.
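- There are also some notes on PCA/SVD that I've written up (http://www.cs.cmu.edu/~wcohen/10-601/PCA-notes/pca.pdf).
- There's a nice description of the gradient-based approach to MF, and a scheme for parallelizing it, by Gemulla et al. (http://people.mpi-inf.mpg.de/~rgemulla/publications/rj10481rev.pdf).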
Summary
You should know:
- What PCA is, and how it relates to matrix factorization.
- How to interpret the "cartoons" that we use to illustrate PCA.
- What loss function and constraints are associated with PCA - i.e., what the "PCA Problem" is.
- How the principal components are related to each other and the data:
  - The earlier components have the highest variance (i.e., when the examples are re-expressed in the space defined by the new basis, their coordinates along the first components have the largest variance).
  - The components are orthogonal to each other (by construction).
- How to interpret the low-dimensional embedding of instances, and the "prototypes" produced by PCA and MF techniques.
  - How to interpret the prototypes in the case of dimension reduction for images.
  - How to interpret the prototypes in the case of collaborative filtering, and completion of a ratings matrix.
- How PCA and MF relate to k-means and EM.
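To make the ratings-matrix completion item above more concrete, here is a minimal sketch of gradient-based matrix factorization in plain Python. It is not taken from the slides or from Gemulla et al.; the function names (factorize, predict, rmse), the toy ratings, and the hyperparameters (rank, lr, reg, epochs) are all assumptions chosen for illustration. Each user row of U is a low-dimensional embedding, and each item row of V plays the role of a "prototype" loading.

```python
import random


def factorize(ratings, n_rows, n_cols, rank=2, lr=0.02, reg=0.02,
              epochs=500, seed=0):
    """Fit R ~ U V^T by stochastic gradient descent on observed entries only.

    ratings is a list of (row, col, value) triples. All hyperparameters
    are illustrative assumptions, not values from the lecture.
    """
    rng = random.Random(seed)
    # Small random initial factors; with epochs=0 these are returned as-is.
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - predict(U, V, i, j)
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error plus an L2 penalty.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V


def predict(U, V, i, j):
    """Predicted rating: inner product of user i's and item j's factors."""
    return sum(a * b for a, b in zip(U[i], V[j]))


def rmse(ratings, U, V):
    """Root-mean-squared error over the given (row, col, value) triples."""
    total = sum((r - predict(U, V, i, j)) ** 2 for i, j, r in ratings)
    return (total / len(ratings)) ** 0.5


# Toy 4-user x 4-item ratings matrix with some entries unobserved.
observed = [
    (0, 0, 5), (0, 1, 4), (0, 3, 1),
    (1, 0, 4), (1, 2, 1), (1, 3, 1),
    (2, 1, 1), (2, 2, 5), (2, 3, 4),
    (3, 0, 1), (3, 2, 4), (3, 3, 5),
]

U, V = factorize(observed, n_rows=4, n_cols=4)
print("training RMSE:", rmse(observed, U, V))
# An unobserved entry is "completed" by the same inner product.
print("predicted rating for user 0, item 2:", predict(U, V, 0, 2))
```

Only observed entries contribute to the loss, which is what lets the learned factors fill in the missing cells. PCA, by contrast, is usually posed on a fully observed, mean-centered matrix with orthogonality constraints on the components, which this sketch does not enforce.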