Class meeting for 10-405 Unsupervised Learning On Graphs

From Cohen Courses

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-405 in Spring 2018.

Slides

Quiz

Optional Readings

  • Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4 (2007): 395-416.
  • Frank Lin and William W. Cohen (2010): Power Iteration Clustering in ICML-2010.
  • Frank Lin and William W. Cohen (2010): A Very Fast Method for Clustering Big Text Datasets in ECAI-2010.
  • Frank Lin and William W. Cohen (2011): Adaptation of Graph-Based Semi-Supervised Methods to Large-Scale Text Data in MLG-2011.
  • Ramnath Balasubramanyan, Frank Lin, and William W. Cohen (2010): Node Clustering in Graphs: An Empirical Study in NIPS-2010 Workshop on Networks Across Disciplines.

Things To Remember

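  • The definitions of the graph Laplacian (D-A) and the normalized Laplacian (I-W), where A is the affinity matrix, D is the diagonal degree matrix, and W is the row-normalized version of A.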
  • What the eigenvectors of W with the largest eigenvalues look like for a block-stochastic matrix.
  • What spectral clustering is: clustering after mapping nodes in a graph to points defined by the K non-trivial eigenvectors of W with the largest eigenvalues.
  • What power iteration clustering is.
  • How to implement the "manifold trick" for PIC and SSL.
  • Why the "manifold trick" improves computational efficiency, relative to computing a K-NN graph.
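
The points above can be illustrated with a minimal sketch in Python with NumPy. It is not the course's reference implementation. It assumes W = D^{-1}A (row normalization), an affinity matrix of the form A = F F^T for a feature matrix F (self-similarities on the diagonal are kept for simplicity), and a degree-based start vector for the power iteration. The function names laplacians, implicit_matvec, and pic_embedding are hypothetical, chosen only for this example.

```python
import numpy as np

def laplacians(A):
    """Return (D - A, I - W) for a dense affinity matrix A, with W = D^{-1} A."""
    d = A.sum(axis=1)                 # node degrees (row sums of A)
    W = A / d[:, None]                # row-normalized affinity matrix
    return np.diag(d) - A, np.eye(len(A)) - W

def implicit_matvec(F):
    """Return a function computing A @ x for A = F F^T without building A.

    F @ (F.T @ x) touches only the entries of F, while forming A
    explicitly needs n^2 entries for n rows.
    """
    return lambda x: F @ (F.T @ x)

def pic_embedding(matvec_A, n, num_iters=10):
    """Run a few steps of v <- W v, L1-normalized, with W = D^{-1} A.

    matvec_A(x) must return A @ x; A itself is never materialized.
    The start vector is the normalized degree vector (an assumption).
    """
    d = matvec_A(np.ones(n))          # degrees, again without forming A
    v = d / d.sum()
    for _ in range(num_iters):
        v = matvec_A(v) / d           # one multiplication by W
        v = v / np.abs(v).sum()       # keep the vector at unit L1 norm
    return v

# Toy data: rows 0-2 use only feature 0, rows 3-4 use only feature 1,
# so A = F F^T is block-diagonal with two blocks.
F = np.array([[1.0, 0.0], [2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
v = pic_embedding(implicit_matvec(F), n=len(F))
```

In this toy case the entries of v are constant within each block and differ between blocks, so running k-means on the one-dimensional embedding v recovers the two groups. In practice power iteration clustering stops the iteration early, before v converges to a globally constant vector; the fixed num_iters here stands in for that stopping rule.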