Inferring Networks of Diffusion and Influence

Comments

  • Most of the proposal comes from the reference page [1]. Please add sections about Task, Evaluation, and Key technical challenges.
  • Add more details about your proposed improvement over the existing approach and how you will evaluate it.
  • Add at least 2-3 related papers that work on a similar problem or the same dataset.

--Bbd 02:43, 11 October 2012 (UTC)

Team Members

Zaid Sheikh

Project Idea

[Project idea taken from: http://www.cs.cmu.edu/afs/.cs.cmu.edu/Web/People/epxing/Class/10701/project.html ]

Information diffusion and virus propagation are fundamental processes that take place over networks. In many applications, the underlying network over which a diffusion or propagation spreads is not directly observed and is hard to recover. Inferring this underlying network from MemeTracker data would be an interesting and challenging project. Gomez-Rodriguez et al. (2010) have recently published a paper on this topic and made their code publicly accessible. In this project, we would first like to replicate their results.

Furthermore, the algorithm proposed in that paper, NetInf, assumes that all connected nodes in the network influence their neighbors with the same probability. We would like to improve on this by observing how meme phrases mutate over time and using this information to estimate the influence probabilities more accurately.
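
As a rough illustration of this idea (not the NetInf implementation, and not a method we have settled on), a candidate edge from u to v within one cascade could be scored by combining a time-decay term with the similarity between the phrase variant seen at u and the one seen at v. Everything in the sketch below is an assumption made for illustration: the function names, the use of `difflib.SequenceMatcher` as the similarity measure, the exponential decay, and the parameters `alpha` and `beta`.

```python
import math
from difflib import SequenceMatcher


def phrase_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two phrase variants (1.0 means identical).

    Assumption: character-level matching, ignoring case, stands in for
    whatever mutation model is eventually chosen.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def edge_score(t_u: float, t_v: float, phrase_u: str, phrase_v: str,
               alpha: float = 1.0, beta: float = 0.5) -> float:
    """Hypothetical unnormalized score for u having passed the meme to v.

    t_u, t_v: times at which u and v mentioned the meme (same unit as alpha).
    alpha:    assumed time scale of the exponential decay.
    beta:     assumed base weight shared by all edges.
    """
    if t_v <= t_u:
        return 0.0  # influence can only flow forward in time
    time_term = math.exp(-(t_v - t_u) / alpha)
    # Scale the shared weight by how little the phrase changed from u to v.
    return beta * phrase_similarity(phrase_u, phrase_v) * time_term
```

With this scoring, a node that repeats a nearly identical variant shortly after u receives a higher score than one that posts a heavily altered variant, so the edge weights are no longer uniform. Whether such a score actually improves the inferred network would need to be checked against the replicated baseline.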

Data

The dataset used by NetInf is called MemeTracker. It can be downloaded from http://memetracker.org/data.html .

MemeTracker provides two datasets. The first is the phrase cluster data: for each phrase cluster, it contains all the phrases in the cluster and a list of URLs where those phrases appeared. The second is the raw MemeTracker phrase data, which contains the phrases and hyperlinks extracted from each article or blog post.
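
A minimal sketch of reading the raw phrase data follows. It assumes the record layout described on the MemeTracker data page, in which each article is a block of lines tagged `P` (article URL), `T` (timestamp), `Q` (a phrase found in the article), and `L` (an outgoing hyperlink). The function name and dictionary keys are illustrative choices, and the layout should be verified against the downloaded files before relying on it.

```python
from typing import Dict, Iterable, Iterator


def parse_raw_phrases(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per article from P/T/Q/L tagged lines (assumed layout)."""
    record = None
    for line in lines:
        parts = line.strip().split(None, 1)  # tag, then the rest of the line
        if not parts:
            continue  # blank line between article blocks
        tag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if tag == "P":
            # A new article starts; emit the previous one first.
            if record is not None:
                yield record
            record = {"url": value, "time": None, "phrases": [], "links": []}
        elif record is None:
            continue  # ignore anything that appears before the first P line
        elif tag == "T":
            record["time"] = value
        elif tag == "Q":
            record["phrases"].append(value)
        elif tag == "L":
            record["links"].append(value)
    if record is not None:
        yield record
```

Because the function accepts any iterable of lines, it can be applied to an open file handle and will stream through a large file one article at a time, for example `for article in parse_raw_phrases(open(path, encoding="utf-8")): ...`, where `path` is wherever the downloaded file was saved.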

References