Class meeting for 10-405 Unsupervised Learning On Graphs

From Cohen Courses
Revision as of 14:04, 23 April 2018 by Wcohen

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-405 in Spring 2018.

Optional Readings

  • Ulrike von Luxburg (2007): A Tutorial on Spectral Clustering in Statistics and Computing 17(4): 395-416.
  • Frank Lin and William W. Cohen (2010): Power Iteration Clustering in ICML-2010.
  • Frank Lin and William W. Cohen (2010): A Very Fast Method for Clustering Big Text Datasets in ECAI-2010.
  • Frank Lin and William W. Cohen (2011): Adaptation of Graph-Based Semi-Supervised Methods to Large-Scale Text Data in MLG-2011.
  • Ramnath Balasubramanyan, Frank Lin, and William W. Cohen (2010): Node Clustering in Graphs: An Empirical Study in NIPS-2010 Workshop on Networks Across Disciplines.

Things To Remember

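  • The definitions of the graph Laplacian (D-A) and the normalized Laplacian (I-W), where A is the adjacency (affinity) matrix, D is the diagonal matrix of node degrees, and W is the degree-normalized version of A (W = D^-1 A in the Power Iteration Clustering paper).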
  • What the eigenvectors of W with the largest eigenvalues look like for a block-stochastic matrix.
  • What spectral clustering is: clustering the nodes of a graph after mapping each node to a point defined by the K non-trivial eigenvectors of W with the largest eigenvalues.
  • What power iteration clustering is.
  • How to implement the "manifold trick" for PIC and SSL.
  • Why the "manifold trick" improves computational efficiency, relative to computing a K-NN graph.
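
The points above can be made concrete with a small sketch. The Python below is illustrative only and is not the course's reference code. It assumes W = D^-1 A, runs a fixed number of power-iteration steps instead of the acceleration-based stopping rule from the PIC paper, and replaces k-means with a split at the largest gap in the one-dimensional embedding. The helper names (row_normalize, pic_embedding, split_at_largest_gap, implicit_W_times) and the toy graph are hypothetical. The last function shows one reading of the "manifold trick": when A = F F^T for a feature matrix F, the product W v can be computed as D^-1 F (F^T v) without ever building A.

```python
def matvec(M, x):
    """Return M x for a dense matrix stored as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def rmatvec(M, x):
    """Return M^T x without materializing the transpose."""
    out = [0.0] * len(M[0])
    for row, xi in zip(M, x):
        for j, m in enumerate(row):
            out[j] += m * xi
    return out

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    d = [sum(row) for row in A]
    n = len(A)
    return [[(d[i] if i == j else 0.0) - A[i][j] for j in range(n)]
            for i in range(n)]

def row_normalize(A):
    """W = D^-1 A, so every row of W sums to 1 (assumes no zero-degree node)."""
    return [[a / sum(row) for a in row] for row in A]

def normalized_laplacian(A):
    """Normalized Laplacian I - W."""
    W = row_normalize(A)
    n = len(A)
    return [[(1.0 if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def pic_embedding(A, num_iters=10):
    """Truncated power iteration: v <- W v / ||W v||_1.

    Assumption: a fixed iteration count stands in for the PIC stopping rule.
    """
    W = row_normalize(A)
    d = [sum(row) for row in A]
    v = [di / sum(d) for di in d]          # degree-based starting vector
    for _ in range(num_iters):
        v = matvec(W, v)
        norm = sum(abs(x) for x in v)
        v = [x / norm for x in v]
    return v

def split_at_largest_gap(v):
    """Two-way clustering of a 1-D embedding (a stand-in for k-means)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    gaps = [v[order[k + 1]] - v[order[k]] for k in range(len(order) - 1)]
    cut = gaps.index(max(gaps))
    labels = [0] * len(v)
    for i in order[cut + 1:]:
        labels[i] = 1
    return labels

def implicit_W_times(F, v):
    """Manifold-trick sketch: W v = D^-1 F (F^T v) with A = F F^T never formed."""
    ones = [1.0] * len(F)
    d = matvec(F, rmatvec(F, ones))        # row sums of F F^T
    Av = matvec(F, rmatvec(F, v))
    return [a / di for a, di in zip(Av, d)]

# Toy graph: a triangle {0,1,2} and a 4-clique {3,4,5,6}, joined by a weak edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0, 0],
    [0, 0, 0.1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 0],
]
v = pic_embedding(A)
labels = split_at_largest_gap(v)
print(labels)
```

In this sketch each call to implicit_W_times touches only the entries of F, so with a sparse F the cost per iteration would scale with the number of nonzeros in F rather than with the n-by-n similarity matrix. A real implementation would use a sparse matrix library; dense lists are used here only to keep the example dependency-free.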