Difference between revisions of "Class meeting for 10-605 Scalable PageRank"

From Cohen Courses
Jump to navigationJump to search
(Created page with "This is one of the class meetings on the schedule for the course Machine Learning with Large Data...")
 
 
(21 intermediate revisions by the same user not shown)
Line 1: Line 1:
This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2014|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Spring_2014]].
+
This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall_2016]].
  
 
=== Slides ===
 
=== Slides ===
  
 +
* [http://www.cs.cmu.edu/~wcohen/10-605/randomized-algs-2.pptx Randomized Algorithms wrap-up], [http://www.cs.cmu.edu/~wcohen/10-605/randomized-algs-2.pdf Randomized Algorithms PDF version].
 +
* [http://www.cs.cmu.edu/~wcohen/10-605/debugging-tips.pptx Tips on Debugging for the SGD Assignment],[http://www.cs.cmu.edu/~wcohen/10-605/debugging-tips.pdf PDF version]
 +
* [http://www.cs.cmu.edu/~wcohen/10-605/pagerank.pptx PageRank and Scalability],  [http://www.cs.cmu.edu/~wcohen/10-605/pagerank.pdf PDF version]
 +
* [http://www.cs.cmu.edu/~wcohen/10-605/rwr-sampling.pptx Sampling from a Graph], [http://www.cs.cmu.edu/~wcohen/10-605/rwr-sampling.pdf  PDF version]
  
* [http://www.cs.cmu.edu/~wcohen/10-605/2013/2013-03-18-rwr.pptx Powerpoint]
+
=== Quiz ===
 +
 
 +
* [https://qna-app.appspot.com/view.html?aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgICgjtDJCAw  [https://qna-app.appspot.com/view.html?aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgICgjtDJCAw]]
  
 
=== Readings ===
 
=== Readings ===
Line 10: Line 16:
 
* [http://dl.acm.org/citation.cfm?id=1150479 Samping from Large Graphs], Jure Leskovec and Christos Faloutsos, KDD 2006.
 
* [http://dl.acm.org/citation.cfm?id=1150479 Samping from Large Graphs], Jure Leskovec and Christos Faloutsos, KDD 2006.
 
* [http://www.math.ucsd.edu/~fan/wp/localpartition.pdf Local Graph Partitioning using PageRank Vectors], Andersen, Chung, Lang, FOCS 2006
 
* [http://www.math.ucsd.edu/~fan/wp/localpartition.pdf Local Graph Partitioning using PageRank Vectors], Andersen, Chung, Lang, FOCS 2006
 +
* [http://link.springer.com/chapter/10.1007/978-3-540-77004-6_13#page-1 Andersen, Reid, Fan Chung, and Kevin Lang. "Local partitioning for directed graphs using PageRank." Algorithms and Models for the Web-Graph. Springer Berlin Heidelberg, 2007. 166-178.]
 +
 +
=== Key things to remember ===
 +
 +
* How to implement graph algorithms like PageRank by streaming through a graph, under various conditions:
 +
** Vertex weights fit in memory
 +
** Vertex weights do not fit in memory
 +
* The meaning of various graph statistics: degree distribution, clustering coefficient, ...
 +
* Why sampling from a graph is non-trivial if you want to preserve properties of the graph like
 +
** Degree distribution
 +
** Homophily as measured by clustering coefficient,
 +
* What local graph partitioning is and how the PageRank-Nibble algorithm, together with sweeps to optimize conductance, can be used to approximately solve it.
 +
* The implications of the analysis of PageRank-Nibble.

Latest revision as of 13:31, 10 August 2016

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall_2016.

Slides

Quiz

Readings

Key things to remember

  • How to implement graph algorithms like PageRank by streaming through a graph, under various conditions:
    • Vertex weights fit in memory
    • Vertex weights do not fit in memory
  • The meaning of various graph statistics: degree distribution, clustering coefficient, ...
  • Why sampling from a graph is non-trivial if you want to preserve properties of the graph like
    • Degree distribution
    • Homophily as measured by clustering coefficient,
  • What local graph partitioning is and how the PageRank-Nibble algorithm, together with sweeps to optimize conductance, can be used to approximately solve it.
  • The implications of the analysis of PageRank-Nibble.