Compare Link Propagation Papers
 
== Papers ==
 
[[ Link propagation: A fast semi-supervised learning algorithm for link prediction ]], Kashima et al., SDM 2009
 
[[ Fast and scalable algorithms for semi-supervised link prediction on static and dynamic graphs ]], Raymond et al., ECML-PKDD 2010
 
== Method ==
 
The first paper essentially treats link prediction as a tensor completion problem: "Link Propagation", an algorithm for predicting missing links between the nodes of a graph using the eigen-decomposition of its Laplacian, is extended to the case of a dynamic graph, which can be represented as a tensor.
 
 
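As a rough illustration of the static-graph core of the method, here is a minimal NumPy sketch (my own paraphrase, assuming a Kronecker-sum pairwise similarity, one of the variants the paper discusses; it is not the authors' reference code):

<pre>
import numpy as np

# Minimal sketch of Link Propagation on a static graph (a paraphrase,
# not the authors' code). W is an auxiliary node-similarity matrix and
# F_star holds the observed link labels: +1 (link), -1 (no link),
# 0 (unknown). The regularized solve (sigma * L_pair + I) vec(F) =
# vec(F_star), where L_pair is the Laplacian of the Kronecker-sum pair
# similarity, becomes diagonal in the eigenbasis of the node Laplacian.
def link_propagation(W, F_star, sigma=1.0):
    L = np.diag(W.sum(axis=1)) - W        # graph Laplacian of the similarity
    lam, U = np.linalg.eigh(L)            # eigen-decomposition L = U diag(lam) U^T
    G = U.T @ F_star @ U                  # rotate observed labels into the eigenbasis
    denom = sigma * (lam[:, None] + lam[None, :]) + 1.0
    return U @ (G / denom) @ U.T          # predicted link-strength scores F
</pre>
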
The second paper is an extension of the first, in the sense that it builds on the previous algorithm, introducing an approximate version that runs faster while performing on par with the original. The approximate algorithm relies on matrix decompositions, in particular the Cholesky decomposition/factorization; its main idea is to approximate the Laplacian of the graph with a lower-rank matrix.

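For a flavor of the kind of low-rank machinery involved (the paper's actual procedure differs; this is only a generic rank-k pivoted Cholesky sketch in the same spirit):

<pre>
import numpy as np

# Generic rank-k pivoted Cholesky factorization: for a PSD matrix A
# (e.g. a graph Laplacian), returns B with A ~= B @ B.T. Replacing the
# full Laplacian with such a low-rank factor is the kind of step that
# makes approximate algorithms fast; the paper's exact scheme differs.
def pivoted_cholesky(A, k):
    n = A.shape[0]
    B = np.zeros((n, k))
    d = np.diag(A).astype(float).copy()   # diagonal of the current residual
    for j in range(k):
        p = int(np.argmax(d))             # pivot on the largest residual entry
        if d[p] <= 1e-12:                 # residual numerically zero: stop early
            return B[:, :j]
        B[:, j] = (A[:, p] - B @ B[p]) / np.sqrt(d[p])
        d -= B[:, j] ** 2                 # downdate the residual diagonal
    return B
</pre>
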
== Dataset Used ==

The first paper uses a relatively large variety of different datasets, i.e.:

* A protein-protein interaction network: C. von Mering, R. Krause, B. Snel, M. Cornell, S. Olivier, S. Fields, and P. Bork. Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 417:399–403, 2002
* A social network dataset: here

The second paper uses multiple versions (of varying size) of the Web Spam Challenge dataset.
  
 
== Problem ==
 
The problem in the first paper is simply that of predicting missing links between nodes, in the setting of a time-evolving graph which can be represented as a tensor.
  
 
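In symbols, the task can be stated roughly as follows (my notation, not taken from the paper): the dynamic graph over <math>N</math> nodes and <math>T</math> time steps is a partially observed tensor

<math>
\mathcal{X} \in \{-1,0,+1\}^{N \times N \times T}, \qquad
\mathcal{X}_{ijt} =
\begin{cases}
+1 & \text{link } (i,j) \text{ present at time } t,\\
-1 & \text{link known to be absent},\\
0 & \text{unknown, to be predicted,}
\end{cases}
</math>

and the goal is to fill in the zero (unknown) entries.
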
However, this is not the main focus of the second paper, even though it builds on the same algorithm. The second paper is chiefly concerned with scaling up "Link Propagation" by introducing approximations into the computations; it is therefore more algorithmically oriented.
 
== Big Idea ==
 
The big idea of the first paper is the use of auxiliary information to aid tensor completion (i.e. the prediction of missing links in a time-evolving graph).

The big idea of the second paper is scaling up the Link Propagation algorithm in both the matrix (static graph) and tensor (dynamic graph) cases, by introducing approximate steps into the computation.
  
 
== Other ==
 
The two papers seem complementary and of very high technical quality. The experimental evaluation of the first paper appears more thorough than that of the second, although this may simply be because the SDM page limit is higher than the ECML-PKDD one (and the page margins of ECML-PKDD do not help either).

== Questions ==

*How much time did you spend reading the (new, non-wikified) paper you summarized?
**1.5 hours
*How much time did you spend reading the old wikified paper?
**45 minutes
*How much time did you spend reading the summary of the old paper?
**15 minutes
*How much time did you spend reading background material?
**30 minutes
*Was there a study plan for the old paper?
**Yes
*If so, did you read any of the items suggested by the study plan, and how much time did you spend reading them?
**No.
