10-601 Matrix Factorization

This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.
=== Slides ===

* [http://www.cs.cmu.edu/~wcohen/10-601/cf.pptx  Slides in Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-601/cf.pdf  Slides in PDF].
* [http://www.cs.cmu.edu/~wcohen/10-601/pca+mf.pdf  Slides in PDF].
 
 
* Poll: [https://piazza.com/class/ij382zqa2572hc]
 
  
 
=== Readings ===

* Murphy Chap 12.  PCA is not covered in Mitchell.  Matrix factorization and collaborative filtering are not covered in Murphy or Mitchell.  Some external readings are below.
* There are also some [http://www.cs.cmu.edu/~wcohen/10-601/PCA-notes/pca.pdf notes on PCA/SVD] that I've written up.
* Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.
 
 
* There's a nice description of [http://people.mpi-inf.mpg.de/~rgemulla/publications/rj10481rev.pdf the gradient-based approach to MF], and a scheme for parallelizing it, by Gemulla et al.
  
=== Summary ===

You should know:
* What social recommendation systems are, and how they relate to matrix factorization.
* What PCA is, and how it relates to matrix factorization.
* How to solve MF via gradient descent (see the SGD sketch after this list).
* How to interpret the "cartoons" that we use to illustrate PCA.
* How matrix factorization is related to PCA and k-means.
* What loss function and constraints are associated with PCA - i.e., what the "PCA Problem" is (a formulation is given after this list).
* How the principal components are related to each other and the data (see the PCA sketch after this list):
** The earlier components have the highest variance (i.e., for the first components, the examples, when re-expressed over the space defined by the new basis, have the largest variance).
** The components are orthogonal to each other (by construction).
* How to interpret the low-dimensional embedding of instances, and the "prototypes" produced by PCA and MF techniques.
** How to interpret the prototypes in the case of dimension reduction for images.
** How to interpret the prototypes in the case of collaborative filtering, and completion of a ratings matrix.
* How PCA and MF relate to k-means and EM.
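
One standard way to state the "PCA Problem" mentioned above (the reconstruction-error view, as in Murphy Chap 12; the notation here is ours rather than the slides'): given a mean-centered data matrix <math>X \in \mathbb{R}^{n \times d}</math> and a target dimension <math>K</math>, find an embedding <math>Z</math> and components <math>W</math> minimizing the squared reconstruction error under an orthonormality constraint:

<math>
\min_{Z \in \mathbb{R}^{n \times K},\; W \in \mathbb{R}^{d \times K}} \; \| X - Z W^{\top} \|_F^2 \qquad \text{subject to } W^{\top} W = I_K .
</math>

The columns of <math>W</math> are the principal components and the rows of <math>Z</math> are the low-dimensional embeddings of the examples; the equivalent "variance" view chooses the same orthonormal directions so as to maximize the variance of the projected data.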
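
Below is a minimal PCA sketch (not from the slides) illustrating the two properties of the components listed above, orthogonality and decreasing variance, using plain NumPy on synthetic data; PCA is computed via the SVD of the centered data matrix.

<pre>
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # toy data: 200 examples, 5 features
Xc = X - X.mean(axis=0)              # PCA assumes mean-centered data

# SVD of the centered data: rows of Vt are the principal components.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

K = 2
W = Vt[:K].T                         # d x K matrix whose columns are the components
Z = Xc @ W                           # n x K low-dimensional embedding

# Orthogonality (by construction): W^T W = I.
print(np.allclose(W.T @ W, np.eye(K)))

# Decreasing variance: the variance of the data along component i is
# s[i]**2 / n, and the singular values s come back in decreasing order.
print(Z.var(axis=0))                 # first entry >= second entry

# Rank-K reconstruction X ~ Z W^T, which is what the "PCA Problem" minimizes.
print(np.linalg.norm(Xc - Z @ W.T))
</pre>

Equivalently, W could be taken from the top eigenvectors of the covariance matrix; the SVD route shown here is the numerically preferred way to get the same answer.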
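
And a minimal SGD sketch for MF, in the spirit of the Gemulla et al. reading but without their parallelization scheme: stochastic gradient steps on the observed entries of a toy ratings matrix, using a regularized squared loss. The rank, step size, regularization strength, and number of epochs are made-up illustrative values, not anything from the lecture.

<pre>
import numpy as np

rng = np.random.default_rng(0)

# Toy ratings matrix; np.nan marks the missing entries to be completed.
R = np.array([[5, 3, np.nan, 1],
              [4, np.nan, np.nan, 1],
              [1, 1, np.nan, 5],
              [np.nan, 1, 5, 4]], dtype=float)
observed = [(i, j) for i in range(R.shape[0])
                   for j in range(R.shape[1]) if not np.isnan(R[i, j])]

K, lr, lam, epochs = 2, 0.05, 0.02, 500       # assumed hyperparameters
U = 0.1 * rng.normal(size=(R.shape[0], K))    # row (user) factors
V = 0.1 * rng.normal(size=(R.shape[1], K))    # column (item) factors

for _ in range(epochs):
    for idx in rng.permutation(len(observed)):  # visit ratings in random order
        i, j = observed[idx]
        ui, vj = U[i].copy(), V[j].copy()
        err = R[i, j] - ui @ vj                 # residual on one observed rating
        # SGD step on err**2 + lam * (||ui||^2 + ||vj||^2):
        U[i] += lr * (err * vj - lam * ui)
        V[j] += lr * (err * ui - lam * vj)

R_hat = U @ V.T                                 # completed ratings matrix
print(np.round(R_hat, 1))
</pre>

Each update is the gradient of the loss restricted to a single observed rating; the Gemulla et al. paper's contribution is a way to run such updates in parallel by partitioning the matrix into blocks that can be processed independently.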
