Dan Cosley, AAAI, 2010
Dan Cosley, Daniel Huttenlocher, Jon Kleinberg, Xiangyang Lan, Siddharth Suri. Sequential Influence Models in Social Networks. Association for the Advancement of Artificial Intelligence. 2010.
In this paper, the authors consider two of the most fundamental definitions of influence, one based on a small set of "snapshot" observations of a social network and the other based on detailed temporal dynamics. The former is particularly useful because large-scale social network data sets are often available only as snapshots or crawls. The latter, however, provides a more detailed process model of how influence spreads. The authors study the relationship between these two ways of measuring influence, in particular establishing how to infer the more detailed temporal measure from the more readily observable snapshot measure. They validate the analysis using the history of social interactions on Wikipedia; the result is the first large-scale study to exhibit a direct relationship between snapshot and temporal models of social influence.
Brief Description Of The Method
To study how influence can be measured, the paper gives two definitions of social influence, ordinal-time and snapshot, and then defines notation for several sets of tuples. Finally, it gives a method for simulating ordinal time from snapshots.
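For two snapshot times t1 and t2, with t1 earlier than t2, each pair of a user u and a community C falls into one of three sets:
- B: u joined C before t1 and u had k1 neighbors in C at t1.
- J: u had k1 neighbors in C at t1, u joined C between t1 and t2, and u had k2 neighbors in C at t2.
- N: u did not join C before t2 and u had k2 neighbors in C at t2.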
The sets B and N shift n_o and d_o upward relative to n_s and d_s, where the subscript o marks ordinal-time quantities and the subscript s marks snapshot quantities. The set J stretches n_s and d_s when compared with n_o and d_o. Finally, the sets J and N make p_s an accumulation, or integration, of p_o.
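The three-way split of observations can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function names, the representation of a (u, C) pair as a join time plus a list of neighbor join times, and the handling of events that fall exactly on t1 or t2 are all hypothetical choices.

```python
from bisect import bisect_right


def neighbors_at(neighbor_join_times, t):
    """Number of u's neighbors that had joined C at or before time t."""
    return bisect_right(sorted(neighbor_join_times), t)


def classify_pair(join_time, neighbor_join_times, t1, t2):
    """Place one (u, C) pair into B, J, or N for snapshots at t1 < t2.

    join_time is the time u joined C, or None if u never joined.
    Assumption: a join exactly at t1 counts as "between t1 and t2",
    and a join exactly at t2 counts as "not before t2".
    """
    k1 = neighbors_at(neighbor_join_times, t1)
    k2 = neighbors_at(neighbor_join_times, t2)
    if join_time is not None and join_time < t1:
        return ("B", k1)       # already a member at the first snapshot
    if join_time is not None and join_time < t2:
        return ("J", k1, k2)   # joined somewhere between the snapshots
    return ("N", k2)           # still not a member at the second snapshot
```

For example, with neighbors joining at times 1, 3, and 8 and snapshots at 5 and 10, a user who joins at time 7 lands in J with k1 = 2 and k2 = 3. The snapshots alone do not reveal whether the join happened at 2 or at 3 neighbors.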
- For the English Wikipedia data, Figure 3(a) shows the plots of n_o(k) and d_o(k) on a log-log scale, along with the best linear fit. A linear model accounts relatively well for the data over a large range, suggesting a power law is a reasonable approximation for each of these quantities. This is not surprising, given that d_o(k) can be viewed as a variation on the standard degree distribution: it measures the distribution of the "degree" (number of edges) of each node into each community.
- Using the Wikipedia ordinal-time data, the authors generated two snapshots, choosing November 1, 2005 and November 6, 2006 as two relatively arbitrary moments at which to measure the full set of community memberships. They then computed p_s(k) using the snapshot method, shown in Figure 3(b).
- The approximation of ordinal-time data from snapshot data depends on two factors: the number of snapshots used and the amount of time between the snapshots. The authors begin by considering the effect of the number of snapshots. They show how the simulation of ordinal time depends on the number of snapshots taken for English Wikipedia in the figure below.
As one would expect, the approximation becomes increasingly accurate with more snapshots. This is because the time between snapshots goes to 0 as their number increases, so in the limit the snapshot measurements converge to the ordinal-time measurements. The figure shows that, empirically, just a few snapshots produce good results for these datasets, which means the convergence occurs fairly rapidly as the number of snapshots increases.
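A toy example can make the convergence argument concrete. The sketch below is not the paper's simulation procedure: it only shows, for a single hypothetical user, how the range of neighbor counts that brackets the join narrows as evenly spaced snapshots are added. All names and the synthetic timeline are assumptions made for illustration.

```python
from bisect import bisect_right


def evenly_spaced(start, end, m):
    """m snapshot times from start to end inclusive (m >= 2)."""
    step = (end - start) / (m - 1)
    return [start + j * step for j in range(m)]


def join_bracket(join_time, neighbor_join_times, snapshot_times):
    """Neighbor counts at the two snapshots that surround the join.

    Returns (k_lo, k_hi), or None if the join is not bracketed by
    two snapshots. The true count at join time lies in that range.
    """
    times = sorted(neighbor_join_times)
    snaps = sorted(snapshot_times)
    i = bisect_right(snaps, join_time)   # index of first snapshot after the join
    if i == 0 or i == len(snaps):
        return None
    k_lo = bisect_right(times, snaps[i - 1])
    k_hi = bisect_right(times, snaps[i])
    return (k_lo, k_hi)


# Synthetic timeline: one neighbor joins per time unit, u joins at 5.5.
neighbor_times = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for m in (2, 3, 6, 11):
    print(m, join_bracket(5.5, neighbor_times, evenly_spaced(0, 10, m)))
```

In this synthetic timeline the user actually joins with 5 neighbors. With 2 snapshots the observed bracket spans 0 to 9 neighbors, while with 11 snapshots it narrows to 5 to 6, which is the intuition behind snapshot measurements approaching ordinal-time measurements as the gap between snapshots shrinks.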
The paper compares its experimental results with the conclusions of Measuring Wikipedia.
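The "best linear fit on a log-log scale" used for Figure 3(a) can be illustrated with an ordinary least-squares line through the points (log k, log y). The sketch below is a minimal version under assumptions: the function name and the synthetic data are hypothetical, and the paper does not state which fitting procedure the authors used.

```python
import math


def fit_power_law(ks, ys):
    """Fit y ~ c * k**alpha by least squares on (log10 k, log10 y).

    Points with non-positive k or y are skipped because their
    logarithm is undefined. Returns (alpha, c).
    """
    pts = [(math.log10(k), math.log10(y))
           for k, y in zip(ks, ys) if k > 0 and y > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                      # exponent alpha
    intercept = mean_y - slope * mean_x    # log10 of the prefactor c
    return slope, 10 ** intercept


# Synthetic counts that follow an exact power law with exponent -2.
ks = list(range(1, 51))
ys = [1000.0 * k ** -2.0 for k in ks]
alpha, c = fit_power_law(ks, ys)
```

A straight line on log-log axes is a quick visual check rather than a rigorous test, which matches the summary's wording that a power law is "a reasonable approximation".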