Influentials, Networks, and Public Opinion Formation

Citation

Watts, Duncan J., and Peter Sheridan Dodds. "Influentials, networks, and public opinion formation." Journal of consumer research 34.4 (2007): 441-458.

Online version

[1]

Problem

The paper studies the process of public opinion formation. Specifically, it challenges the classical "two-step flow" model, which claims that a minority of influentials (a.k.a. opinion leaders) play a crucial role in diffusing information from the media to the majority of society. The authors argue that the role of influentials in the two-step flow is considerably overestimated and that influentials are in fact only modestly more important than average individuals.

Idea

The basic idea is to study the problem qualitatively on the authors' proposed models in order to validate (or invalidate) the claims of the two-step flow model.

Method

The proposed model assumes that each individual makes a binary decision (e.g., agree or disagree) that is affected by the decisions of others (called positive externalities). As a result, information diffuses over a so-called influence network, in which each person is a node and a directed edge from node a to node b indicates that a's decision influences b's decision. The model differs from the two-step model in two respects: the opinions of the majority can in turn affect the opinions of the influentials, and propagation can take many steps, whereas the two-step model requires only two.

Fig.1 Tracking 50 largest threads

Fig.2 Tracking 50 largest threads

Important Assumptions

The authors make three important assumptions in their basic model:

The edges in the influence graph are selected at random.
Given a node $i\,$, the number of its neighbors $n_{i}\,$ is drawn from an influence distribution, i.e., a Poisson distribution.
The decision of an individual is determined by a piecewise threshold function.
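
The second and third assumptions can be written out as follows. This is a sketch in notation that is partly assumed here: $k$ is a possible neighbor count, $b_{i}$ is the fraction of the individuals influencing $i$ who are already active, and $\phi_{i}$ is the threshold of $i$. The summary itself names none of these symbols, and the paper's exact formulation may differ.

$$P(n_{i}=k)=\frac{n_{avg}^{k}\,e^{-n_{avg}}}{k!},\qquad k=0,1,2,\ldots$$

$$\text{decision of } i=\begin{cases}1 & \text{if } b_{i}\geq \phi_{i}\\ 0 & \text{if } b_{i}<\phi_{i}\end{cases}$$

Here $n_{avg}\,$ is the mean of the Poisson distribution, which plays the role of the graph density.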

Basic Model

The basic model is built on the three assumptions above. The dynamics of influence are as follows: a node is chosen at random from the inactive nodes in the graph and activated; the active node may then trigger activations of its neighbors, generating a sequence of activations called a cascade. The process is repeated with different initial nodes and different parameters of the Poisson distribution.
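
The dynamics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the network size, the threshold value, and the use of independent random edges (which gives approximately Poisson-distributed neighbor counts) are all assumptions made for the example.

```python
import random

def build_influence_network(n_nodes, n_avg, rng):
    """Return influencers[j] = set of nodes whose decisions influence j.

    Assumption: each directed edge i -> j exists independently with
    probability n_avg / (n_nodes - 1), so neighbor counts are binomial,
    which is approximately Poisson with mean n_avg for large n_nodes.
    """
    p = n_avg / (n_nodes - 1)
    influencers = [set() for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(n_nodes):
            if i != j and rng.random() < p:
                influencers[j].add(i)
    return influencers

def run_cascade(influencers, seed_node, threshold):
    """Activate seed_node and return the final number of active nodes.

    A node becomes active once the active fraction of its influencers
    reaches the threshold. Activation is never undone, so sweeping until
    nothing changes gives the same final set in any update order.
    """
    active = {seed_node}
    changed = True
    while changed:
        changed = False
        for node, sources in enumerate(influencers):
            if node in active or not sources:
                continue
            if len(sources & active) / len(sources) >= threshold:
                active.add(node)
                changed = True
    return len(active)

# Hypothetical parameters, chosen only for illustration.
rng = random.Random(0)
network = build_influence_network(200, 3, rng)
sizes = [run_cascade(network, rng.randrange(200), 0.18) for _ in range(20)]
print("largest cascade:", max(sizes), "mean size:", sum(sizes) / len(sizes))
```

Repeating the last three lines for several values of the density parameter gives a rough view of how cascade size changes with network structure.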

The authors observe that the size of a cascade depends more on the graph structure (specifically, the graph density) than on the degree (influence) of the individual node.
They also point out that influentials tend to trigger larger cascades than average individuals, especially as early adopters, but that their role is marginal compared with that of the network structure.

Since the assumptions of the basic model may be too strong to hold in practice, the authors propose three variations of the model that relax them.

Variation 1

In this model, they change the influence distribution. With the graph density fixed at $n_{avg}=3\;$ and the variance of the influence distribution varied, they find that in high-variance networks some nodes have very high degree; these are called "hyperinfluentials". However, the cascades triggered by hyperinfluentials are less successful than those triggered in low-variance networks. As the authors note, this result is NOT surprising: with the graph density fixed, a low-variance influence distribution is more likely to generate a well-connected network, which allows for larger cascades, whereas high-variance networks usually consist of several connected components, which restricts the size of the largest cascade.
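
To make "same density, different variance" concrete, the sketch below compares two hypothetical degree sequences that share the mean $n_{avg}=3$. The sequences are invented for illustration and are not the distributions used in the paper.

```python
from statistics import mean, pvariance

# Hypothetical degree sequences for 100 nodes, both with mean 3.
low_variance = [3] * 100                # every node influences 3 others
high_variance = [1] * 90 + [21] * 10    # ten "hyperinfluentials"

for name, degrees in [("low", low_variance), ("high", high_variance)]:
    # Same mean, very different spread and maximum degree.
    print(name, mean(degrees), pvariance(degrees), max(degrees))
```

In the high-variance sequence most nodes have a single neighbor, which is consistent with the summary's point that such networks tend to be poorly connected overall.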

Variation 2

In this model, they construct the influence network from groups or communities rather than individuals. There are two types of network, integrated and concentrated, and both support their conclusions about diffusion. In addition, the influence of early adopters is always below the average influence of influentials, which suggests that group structure may not affect the ability of individuals to trigger or sustain cascades.

Variation 3

In this model, they adopt a smooth SIR (Susceptible, Infected, Recovered) function instead of the piecewise threshold function. Although the two functions lead to very different propagation dynamics, their conclusions are largely unchanged: influentials tend to trigger only marginally larger cascades than average individuals, especially with the SIR function.
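
For comparison with the threshold rule, a generic discrete-time SIR process on an influence network might look like the sketch below. The paper's exact SIR formulation may differ; the function name and the parameters `beta` (per-step transmission probability) and `gamma` (per-step recovery probability) are assumptions made for this example.

```python
import random

def run_sir(influences, seed_node, beta, gamma, rng):
    """Return how many nodes were ever infected, starting from seed_node.

    influences[i] lists the nodes that i influences. In each step, every
    infected node infects each susceptible target with probability beta,
    then recovers with probability gamma. Recovered nodes stay recovered.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive so the process terminates")
    susceptible = set(range(len(influences))) - {seed_node}
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for node in infected:
            for target in influences[node]:
                if target in susceptible and rng.random() < beta:
                    newly_infected.add(target)
        # Only nodes infected before this step may recover in this step.
        newly_recovered = {node for node in infected if rng.random() < gamma}
        susceptible -= newly_infected
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    return len(recovered)

# Hypothetical four-node ring: 0 -> 1 -> 2 -> 3 -> 0.
ring = [[1], [2], [3], [0]]
print(run_sir(ring, 0, beta=0.5, gamma=0.5, rng=random.Random(0)))
```

Unlike the threshold rule, a single infected neighbor can be enough to infect a node here, which is one reason the two rules produce different propagation dynamics.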

Data set

No real-world dataset is presented in the paper and all datasets used in the study are synthetic.

Conclusions

Influentials matter less than the structure of the influence network and the influence function. When the right global combination of conditions exists, a large cascade can be triggered by any spark, whereas in the absence of these conditions, none will suffice. It is therefore imprudent to attribute information propagation to a few influentials, as the two-step model does.

Notes

[2] Support website

[3] J. Leskovec, M. McGlohon, C. Faloutsos, N. Glance, M. Hurst. Cascading behavior in large blog graphs. SDM, 2007.

[4] X. Wang and A. McCallum. Topics over time: a non-Markov continuous-time model of topical trends. KDD, 2006.

[5] X. Wang, C. Zhai, X. Hu, R. Sproat. Mining correlated bursty topic patterns from coordinated text streams. KDD, 2007.
