Agarwal et al, ICWSM 2009
Contents
Citation
Nitin Agarwal, Huan Liu, Sudheendra Murthy, Arunabha Sen, Xufei Wang: A social identity approach to identify familiar strangers in a social network: in Proceedings of the 3rd International AAAI Conference of Weblogs and Social Media 2009.
author = {Nitin Agarwal Huan Liu Sudheendra Murthy Arunabha Sen Xufei Wang}, title = {A Social Identity Approach to Identify Familiar Strangers in a Social Network}, conference = {International AAAI Conference on Weblogs and Social Media}, year = {2009}, keywords = {Social Network, Blog, Familiar Stranger, Social Identity}, url = {http://www.aaai.org/ocs/index.php/ICWSM/09/paper/view/184/565},
Summary
In this paper by Agarwal et al., authors define the problem of finding familiar strangers in a social network, discuss the importance of the problem and provide a technique that can identify such individuals.
Familiar strangers are group of people who share common interests, hobbies occupation but they don't know each other. In a social network setting, it is important to identify such people since the aggregated data can be used towards better predictive and personalization models and trend analysis techniques. It is also good for the individuals, since once they know the people who they share common interests with, they can expand their social network, and have new business opportunities.
To formulate the problem, they represent each individual as a vertex in a network and they associate one or more attributes drawn from a domain to describe the vertex. Then the problem is reduced to search the network to find the vertices that have similar attributes with respect to a given goal.
They compare four methods: Integer Linear Programming solution to the Steiner Tree problem , Exhaustive Search, Random search and their approach, Social Identity Based search.
Steiner Tree method can identify familiar strangers in a network, since it has a global view of the entire network. Both exhaustive search and steiner tree problem obtain %100 percent accuracy but they have exponential cost. Random search does not identify familiar strangers accurately but it is faster. Social identity based search on the other hand, can achieve a descent accuracy without sacrificing too much on the run time.
Datasets
They compare the four approaches on the following two datasets, which display social network characteristics such as power law degree distribution and small world properties.
- DBLP
Conclusion
The social identity technique that they borrow to solve the problem is vaguely defined: It is based on the assumption that people cluster their contacts, so they apply clustering as a preprocessing step. They also don't report the run time of the clustering.
Related Works and Papers
Although there has been research to solve the exact problem in this formulation per se, there are method which solve related problems. In the Shen, 2006 paper authors try to predict topics for bloggers. Based on the topic similarity and the blog content, they detect similar bloggers. Also Girvan and Newman, 2002 and Newman, 2006 came up with two different metrics to cluster a social network. First one, measures edge betweenness based on the observation that communities have lower edge betweenness value if they are loosely connected. Second one, measures modularity in a network by looking at the in links. Both approaches do not consider the egocentric view, or they don't give attributes to vertices.
References
1- Shen D., Sun J.-T., Yang Q. and Chen Z. 2006. Latent friend mining from blog data. In ICDM’06 2- Girvan, M. and Newman M. 2002. Community structure in social and biological networks. PNAS 99 (12) 3- Newman, M. E. J. 2006. Modularity and community structure in networks. PNAS 103 (23) 8577-8582