Supervised Random Walk

From Cohen Courses

Revision as of 16:11, 31 March 2011

This is one of the papers discussed and written up in the course Social Media Analysis 10-802 in Spring 2011.

Citation

Lars Backstrom & Jure Leskovec "Supervised Random Walks: Predicting and Recommending Links in Social Networks"

Online version

Link to paper

Summary

This paper addresses the problem of Link Prediction using the method of Random walk with restart. Supervised Random Walk is interesting because it ranks nodes based not only on the network structure but also on the rich node and edge attributes that exist in the dataset. The method is a supervised learning task in which the goal is to learn the parameters of a function that assigns each edge a strength (the probability of a random walker taking that edge) such that the walker is more likely to reach nodes to which new links will be created in the future.

This method is used to recommend friends on a Facebook dataset and to predict links in a collaboration network from the arXiv database.

Method

Typically, Random walk with restart involves assigning a probability to every edge, which indicates the probability that a random walker takes that edge given that the walker is currently at one of its endpoints. These probabilities determine which nodes are closer to the node from which the random walks restart. One simple method is to assign equal probability to every edge out of a given node. However, the supervised random walk presented in this paper provides a method to learn these probabilities so that a random walker restarting from node ''s'' is more likely to reach "positive" nodes than "negative" nodes. Positive nodes are those to which links were formed in the training dataset, and negative nodes are all remaining nodes not connected to node ''s''. The method is formulated as an optimization problem for which an efficient estimation procedure is derived.

Problem Formulation

Given a graph <math>G = (V,E)</math> and training data consisting of the set <math>D</math> of positive nodes, to which links were formed by ''s'', and the set <math>L</math> of all other nodes in the graph, to which links were not formed. Assume that we are learning the probability assignments for edges with respect to a single source node ''s''. Each edge <math>(u,v)</math> has a feature vector <math>\psi_{uv}</math>, which may contain features of the interaction between nodes <math>u, v</math> as well as features of the individual nodes. A parameterized function <math>f_w(\psi_{uv}) = a_{uv}</math> assigns an edge strength (probability) <math>a_{uv}</math> given the feature vector <math>\psi_{uv}</math>. The problem is to learn the parameters <math>w</math> such that, if we do random walks with restarts from ''s'' on the graph whose edge strengths are assigned by <math>f_w</math>, the stationary distribution <math>p</math> of the random walk satisfies <math>p_l < p_d</math> for each <math>d \in D, l \in L</math>. This means the walker is more likely to reach the positive nodes in <math>D</math> than the nodes in the negative set <math>L</math>. Hence, the optimization problem is to find the minimum-norm parameter vector satisfying this condition:

:<math>\min_w F(w) = \|w\|^2</math>

such that

:<math>\forall\, d \in D,\ l \in L : \quad p_l < p_d</math>

The above formulation imposes the "hard" constraint that the stationary probability of every negative node must be less than that of every positive node. The constraint is relaxed by introducing a loss function <math>h</math> such that <math>h(p_l - p_d) = 0</math> if <math>p_l < p_d</math>, but <math>h(p_l - p_d) > 0</math> in case the constraint is violated. With these "soft" constraints, the optimization problem becomes:

:<math>\min_w F(w) = \|w\|^2 + \lambda \sum_{d \in D,\, l \in L} h(p_l - p_d)</math>

In the above formulation, <math>\lambda</math> is a regularization parameter that trades off the complexity of the model (measured by the norm of <math>w</math>) against the fit of the model (how many constraints are violated).
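As a concrete sketch of the soft-constraint objective above, one possible choice of loss is a squared hinge (the paper discusses several variants; the margin parameter <math>b</math> and the function names here are illustrative, not the paper's exact implementation):

```python
def h(x, b=0.0):
    """Soft-constraint loss: zero when the constraint p_l < p_d holds
    (i.e., x = p_l - p_d is below -b), positive when it is violated.
    A squared hinge is one possible choice among several."""
    return max(0.0, x + b) ** 2

def objective(w, p, D, L, lam=1.0):
    """||w||^2 + lambda * sum of soft-constraint losses over (d, l) pairs.
    p maps node ids to stationary probabilities."""
    reg = sum(wi * wi for wi in w)                      # ||w||^2
    loss = sum(h(p[l] - p[d]) for d in D for l in L)    # constraint violations
    return reg + lam * loss
```

When all positive nodes outrank all negative nodes, the loss term vanishes and only the regularizer remains.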

Solving Optimization Problem

In order to solve the optimization problem, the objective is differentiated. First, the loss function <math>h</math> has to be differentiable, and second, we need the relationship between the parameters <math>w</math> and the random walk scores <math>p</math>. Given the edge strengths <math>a_{uv} = f_w(\psi_{uv})</math>, the stochastic transition matrix <math>Q'</math> is given as follows:

:<math>Q'_{uv} = \begin{cases} \frac{a_{uv}}{\sum_{k} a_{uk}} & \text{if } (u,v) \in E \\ 0 & \text{otherwise} \end{cases}</math>

From this stochastic transition matrix we obtain the full transition probability matrix of the Random walk with restart, where <math>\alpha</math> is the restart probability:

:<math>Q_{uv} = (1 - \alpha)\, Q'_{uv} + \alpha\, \mathbf{1}(v = s)</math>

Given this transition probability matrix, the stationary distribution <math>p</math> of the Random walk with restart is given by the eigenvector equation:

:<math>p^T = p^T Q</math>
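The eigenvector equation above is typically solved by power iteration. A minimal sketch, taking the row-stochastic walk matrix <math>Q'</math> and applying the restart to ''s'' inside the loop (the restart probability and iteration count are illustrative defaults):

```python
def stationary_distribution(Qprime, s, alpha=0.15, iters=100):
    """Power iteration for p^T = p^T Q with Q = (1-alpha) Q' + alpha restart
    to node s. Qprime must be row-stochastic; returns the stationary vector."""
    n = len(Qprime)
    p = [1.0 / n] * n                      # start from the uniform distribution
    for _ in range(iters):
        new_p = [0.0] * n
        for u in range(n):
            pu = p[u] * (1 - alpha)        # mass that follows an edge
            for v in range(n):
                new_p[v] += pu * Qprime[u][v]
        new_p[s] += alpha                  # restart mass: alpha * sum(p) = alpha
        p = new_p
    return p
```

Since <math>p</math> stays a probability distribution at every step, the iteration converges geometrically at rate <math>(1 - \alpha)</math>.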

With this we can find the relationship between the parameters <math>w</math> and the stationary distribution <math>p</math>. Writing <math>\delta_{ld} = p_l - p_d</math>, differentiating the objective gives:

:<math>\frac{\partial F(w)}{\partial w} = 2w + \lambda \sum_{d \in D,\, l \in L} \frac{\partial h(\delta_{ld})}{\partial \delta_{ld}} \left( \frac{\partial p_l}{\partial w} - \frac{\partial p_d}{\partial w} \right)</math>

Now since <math>p</math> is the principal eigenvector of the transition probability matrix, we have <math>p_u = \sum_j p_j Q_{ju}</math>, and hence:

:<math>\frac{\partial p_u}{\partial w} = \sum_j Q_{ju} \frac{\partial p_j}{\partial w} + p_j \frac{\partial Q_{ju}}{\partial w}</math>

In the above equation, <math>\frac{\partial p_u}{\partial w}</math> can be computed by a power-iteration-like recursion, and <math>\frac{\partial Q_{ju}}{\partial w}</math> can be computed from the definition of the transition probability matrix.

With this, gradient descent can be applied and <math>F(w)</math> is minimized.
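The paper derives the gradient recursively as above; as a simple stand-in (or a correctness check for an analytic gradient on small problems), one can approximate the gradient numerically and run plain gradient descent. This sketch is a generic numerical substitute, not the paper's procedure:

```python
def numerical_gradient(F, w, eps=1e-6):
    """Central-difference approximation of dF/dw_i; useful for small problems
    or for verifying a hand-derived gradient."""
    grad = []
    for i in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad.append((F(w_plus) - F(w_minus)) / (2 * eps))
    return grad

def gradient_descent(F, w0, lr=0.1, steps=200):
    """Plain gradient descent on F starting from w0."""
    w = list(w0)
    for _ in range(steps):
        g = numerical_gradient(F, w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w
```

In practice <math>F</math> is not convex in <math>w</math> because <math>p</math> depends on <math>w</math> through the random walk, so gradient descent finds a local optimum.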

Datasets Used

For link prediction and link recommendation, the paper uses the following two datasets:

  • The Facebook dataset consists of Facebook users in Iceland. The authors chose Iceland because Facebook penetration was highest there, and hence new links were formed more rapidly than in other countries.
  • A collaboration network from the arXiv database.

Experiments

Experimental Setup

Experimental Results

Synthetic dataset

Real world dataset

Conclusion

If the number of nodes is very large, only nodes that are within a small number of hops of ''s'' should be considered as potential candidates, so that the method does not blow up the size of the transition matrix.
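Restricting candidates to nodes within a few hops can be sketched as a bounded breadth-first search (the function name and adjacency-dict representation are illustrative):

```python
from collections import deque

def candidates_within_k_hops(adj, s, k):
    """Breadth-first search from s, collecting all nodes within k hops.
    adj maps each node to a list of its neighbors. Limiting the candidate
    set this way keeps the transition matrix small."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if dist[u] == k:           # do not expand beyond k hops
            continue
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return {v for v in dist if v != s}
```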