Supervised Random Walks: Predicting and Recommending Links in Social Networks
Supervised Random Walks: Predicting and Recommending Links in Social Networks
Contents
Citation
Backstrom, Lars and Leskovec, Jure Supervised random walks: predicting and recommending links in social networks WSDM '11
Online version
Summary
This paper tries to solve the dilemma for link prediction between unsupervised methods which are sometimes too heuristic and lead to uncontrollable performance and some supervised machine learning approaches which have great challenge in scalability.
The key idea of this paper is to adopt a random walk framework and train the transition probability to generate a higher visiting frequency on the positive nodes than negative nodes in the training set. From the trained transition probability, new nodes with high visiting frequency is the output as promising new links.
Optimization Function
For each edge , there is a corresponding feature vector that describes the nodes feature (age, gender, etc.) and the interaction attributes (e.g., when the edge was created). And we have a function which as parameterized by and computes the transition probability for . Considering starting from a single node s and means the probability of random walk from s to stop at i with all the transition probabilities. And the optimization problem now becomes:
where D is the positive training sets and L is negative. is to overcome the over fitting problem.
Data Sets and Experiments
- Synthetic data
The paper generates synthetic graphs to test whether the algorithm can successfully recover if the graph is generated using the random walk way. And the result shows almost 100% accuracy with noise-free data and a still high accuracy with some noise.
- Four co-authorship networks from arXiv e-print archive.
The AUC score is 0.71238 and the best unsupervised method gets 0.638 and best other supervised method gets 0.674.
- Facebook data of Iceland
The AUC score is 0.828 and unsupervised random walk(equal weighted) gets an AUC score 0.817 and best supervised method gets 0.817
Related papers
Liben-Nowell Kleinberg J. Am.Soc.Inf.Sci.2007 studies extensively on the unsupervised topological features for link prediction
Fast dynamic reranking in large graphs and Link precition in relational data discuss some supervised frameworks for link prediction.
Study Plan
- Slides about personalized random walk