Rodriguez et al Oct 2011

From Cohen Courses
Jump to navigationJump to search

Citation

Manuel Gomez Rodriguez , Jure Leskovec , Andreas Krause, Inferring networks of diffusion and influence, Oct 2011, arXiv:1006.0234 [cs.DS]

Online Version

http://arxiv.org/abs/1006.0234

Summary

This paper addresses Diffusion network inference problem. They claim that there is an underlying unknown network over which information, viruses or influence propagate. By assuming that the underlying network is static over time, they observe the times when the nodes get infected. By this, they want to determine the paths the diffusion took through the unobserved network. The paper formulates a cascade transmission model based on the independent cascade model [Kempe et al. 2003]. The paper also formulates and demonstrates the usefulness of the NETINF algorithm.

They show a proof of concept for the algorithm on a real Web information propagation dataset of 170 million blog and news articles over a one year period. The results show that online news propagation networks tend to have a core-periphery structure with a small set of core blog and news media websites that diffuse information to the rest of the Web, news media websites tend to diffuse the news faster than blogs and blogs keep discussing about news longer time than media websites. (as the paper quotes).

Dataset

They use more than 172 million news articles and blog posts from 1 million online sources over a period of one year from September 1 2008 till August 31 2009. They use memetracker to extract short textual phrases. It can be found onine here.

Study Plan

  • D. Kempe, J. M. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In KDD ’03: Proc. of the 9th ACM SIGKDD international conference on

Knowledge discovery and data mining, pages 137–146, 2003.

For understanding the proofs:

  • G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, 14(1):265–294, 1978.
  • J. Leskovec and C. Faloutsos. Scalable modeling of real graphs using kronecker multiplication. In ICML ’07: Proc. of the 24th International Conference on Machine Learning,

page 504, 2007.


  • One might have to go through a lot more resources depending on what lever of math, one is comfortable with


References

1 Supporting website. http://snap.stanford.edu/netinf.

2 Eytan Adar , Lada A. Adamic, Tracking Information Epidemics in Blogspace, Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, p.207-214, September 19-22, 2005 [doi>10.1109/WI.2005.151]

3 L. Barabási. The origin of bursts and heavy tails in human dynamics. Nature, 435:207, 2005.

4 L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509--512, 1999.

5 Clauset, C. Moore, and M. E. J. Newman. Hierarchical structure and the prediction of missing links in networks. Nature, 453(7191):98--101, 2008.

6 Crane and D. Sornette. Robust dynamic classes revealed by measuring the response function of a social system. Proc. of the National Academy of Sciences, 105(41):15649-15653, October 2008.

7 Edmonds. Optimum branchings. Journal of Resesearch of the National Bureau of Standards, (71B):233--240, 1967.

8 Erd\Hos and A. Rényi. On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Science, 5:17--67, 1960.

9 Friedman and D. Koller. Being Bayesian about network structure. A Bayesian approach to structure discovery in Bayesian networks. Machine Learning, 50(1):95--125, 2003.

10 Friedman, I. Nachman, and D. Pe'er. Learning Bayesian network structure from massive datasets: The "Sparse Candidate" algorithm. In Proc. UAI, 1999.

11 Daniel Gruhl , R. Guha , David Liben-Nowell , Andrew Tomkins, Information diffusion through blogspace, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA [doi>10.1145/988672.988739]

12 Katz and P. Lazarsfeld. Personal influence: The part played by people in the flow of mass communications. Free Press, 1955.

13 David Kempe , Jon Kleinberg , Éva Tardos, Maximizing the spread of influence through a social network, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2003, Washington, D.C. [doi>10.1145/956750.956769]

14 Samir Khuller , Anna Moss , Joseph (Seffi) Naor, The budgeted maximum coverage problem, Information Processing Letters, v.70 n.1, p.39-45, April 01, 1999 [doi>10.1016/S0020-0190(99)00031-9]

15 Knuth. The art of computer programming. Addison-Wesley, 1968.

16 Ravi Kumar , Jasmine Novak , Prabhakar Raghavan , Andrew Tomkins, Structure and evolution of blogspace, Communications of the ACM, v.47 n.12, December 2004 [doi>10.1145/1035134.1035162]

17 Jure Leskovec , Lada A. Adamic , Bernardo A. Huberman, The dynamics of viral marketing, Proceedings of the 7th ACM conference on Electronic commerce, p.228-237, June 11-15, 2006, Ann Arbor, Michigan, USA [doi>10.1145/1134707.1134732]

18 Jure Leskovec , Lars Backstrom , Jon Kleinberg, Meme-tracking and the dynamics of the news cycle, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 28-July 01, 2009, Paris, France [doi>10.1145/1557019.1557077]

19 Jure Leskovec , Christos Faloutsos, Scalable modeling of real graphs using Kronecker multiplication, Proceedings of the 24th international conference on Machine learning, p.497-504, June 20-24, 2007, Corvalis, Oregon [doi>10.1145/1273496.1273559]

20 Jure Leskovec , Jon Kleinberg , Christos Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA [doi>10.1145/1081870.1081893]

21 Jure Leskovec , Andreas Krause , Carlos Guestrin , Christos Faloutsos , Jeanne VanBriesen , Natalie Glance, Cost-effective outbreak detection in networks, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA [doi>10.1145/1281192.1281239]

22 Jure Leskovec , Kevin J. Lang , Anirban Dasgupta , Michael W. Mahoney, Statistical properties of community structure in large social and information networks, Proceeding of the 17th international conference on World Wide Web, April 21-25, 2008, Beijing, China [doi>10.1145/1367497.1367591]

23 Leskovec, M. McGlohon, C. Faloutsos, N. Glance, and M. Hurst. Cascading behavior in large blog graphs. In SDM '07: Proc. of the SIAM Conference on Data Mining, 2007.

24 Leskovec, A. Singh, and J. M. Kleinberg. Patterns of influence in a recommendation network. In PAKDD '06: Proc. of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 380--389, 2006.

25 Liben-Nowell and J. Kleinberg. Tracing the flow of information on a global scale using Internet chain-letter data. Proc. of the National Academy of Sciences, 105(12):4633--4638, 25 Mar. 2008.

26 D. Malmgren, D. B. Stouffer, A. E. Motter, and L. A. A. N. Amaral. A poissonian explanation for heavy tails in e-mail communication. Proc. of the National Academy of Sciences, 105(47):18153--18158, November 2008.

27 Nemhauser, L. Wolsey, and M. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, 14(1):265--294, 1978.

28 M. Rogers. Diffusion of Innovations. Free Press, New York, fourth edition, 1995.

29 Wallinga and P. Teunis. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology, 160(6):509--516, 2004.

30 J. Watts and P. S. Dodds. Influentials, networks, and public opinion formation. Journal of Consumer Research, 34(4):441--458, December 2007.