Graphs over Time Densification Laws Shrinking Diameters and Possible Explanations
Citation
Leskovec, J. and Kleinberg, J. and Faloutsos, C. 2005. Graphs over time: densification laws, shrinking diameters and possible explanations. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining (KDD’05), 177--187.
Online version
Summary
This is an influential paper presents key insights and observations about how real graphs evolve over time based on analysis of real graphs. The key findings and contributions of the paper are as follows:
- Most of the real graphs densify over time. The number of edges added to the network grows super linearly with the number of nodes added.
- The average distance between nodes shrinks over time.
- The paper proposes a new Graph Generation model (Forest Fire) that exhibits these new properties.
Brief description of the method
The authors first elaborate on densification property, stating that as graphs evolve over time, they follow a version of , where lies strictly between 1 and 2. e(t) and n(t) denote number of edges and nodes at time t, respectively. Since is strictly between 1 and 2, this indicates the existence of power laws.
The next point authors focus on is the property of shrinking parameters. Although the idea behind this property could not be answered fully in the paper, the authors present additional results to validate that their results are obtained over mature graphs and shrinking diameter property is not observed due to (i) possible sampling problems, (ii) disconnected components, (iii) missing past effects (the nodes current links in the graph refer to but not included in the dataset since the dataset is collected beyond a certain point in time), (iv) emergence of a giant component during which small disconnected groups attach together quickly to form one big component.
Then, the authors present two graph generation models. One basic model, called “Community Guided Attachment” models recursive “Communities within communities” kind of pattern. The second model called “Forest Fire” encompasses ‘rich get richer’ (heavy-tailed in-degree), ‘copying’, heavy-tailed out-degree as well as densification power laws and shrinking diameter properties.
Datasets
This paper uses 4 different datasets: (1) arXiv Citation Graph High Energy physics dataset (2) arXiv Comprehensive ( arXiv Comprehensive) (3) Patents US Patent citation dataset (4) Autonomous Systems Autonomous systems dataset