Difference between revisions of "Xiang et al., 2010,Modeling Relationship Strength in Online Social Networks"

From Cohen Courses
Jump to navigationJump to search
m (Created page with '== Online version == An online version of this paper is available at the [[http://portal.acm.org/ft_gateway.cfm?id=1718518&type=pdf&CFID=7653019&CFTOKEN=17876787 ACM digital lib…')
 
Line 5: Line 5:
 
== Summary ==
 
== Summary ==
  
[[AddressesProblem::Influence propagation]] in social networks has interesting applications, especially for viral marketing.  Most past studies assume as input a graph with nodes for each person, and weighted edges between the nodes if there is influence between the two persons. However, less attention has been put on how to build this graph using social media data. This [[Category::paper]] introduces a model of influence built using social graph data on one hand, and a log of action (e.g., joining a community) on the other hand. The model is validated on the [[UsesDataset::Flick dataset]], which consists in a social graph with 1.3M nodes/40M edges and action log of 300K distinct actions. They propose and evaluate both static and dynamic models for this problem, and show that the influence of users on others (e.g., influence another user to join a group) can be modeled with a high accuracy.
+
This [[category:paper]] focuses on [[UsesMethod:unsupervised learning]] for [[AddressesProblem::Determining Social Network Attributes]], more specifically, link strength in social network. Previous work focusing on friendship relations mostly assumes binary  relation (connected or not connected). However, the authors argue that real-life network is a more complicated environment, where acquaintances and best-friends relations are mixed together. They develop an unsupervised model to estimate the strenght of these relations by using features such as bi-directional communication as well as user similarity.  Their approach is evaluated on Facebook, and shows an improved classification accuracy.
  
 
== Key Contributions ==
 
== Key Contributions ==
  
The biggest contribution claimed by the authors in this paper is the empirical evidence of influence in social networks using a discrete time model.
+
Ways of predicting negative relations in social networks.
 +
 
 +
== Background ==
 +
The authors refer to two main theories that have been proposed in psychology to explain the possible causes of negative links: balance and status.  Balance is a concept that includes transitional relationships (e.g., "the friend of my friend is my friend"), as well as other variants (e.g., "the friend of my enemy is my enemy").  This balance might explain how negative links are created in social networks (e.g., especially if relationships are public).  The concept of status is that every human conceives other humans as having either a higher or lower status.  Such idea can be represented with (u,v) links, where the existence of such links means that u considers v of higher status.  Once again, these status relations might influence the characterization of "friend" or "enemy" (e.g., one prefers to be friend with another of higher status).
  
 
== Models ==
 
== Models ==
  
=== Static Model ===
+
=== [[UsesMethod:: Logistic Regression]] ===  
Everytime a user tries to activate (i.e. influence) his neighbor, he has a probability Pi of succeedingThis can be modeled using a Bernouilli distribution.
+
The task of predicting edge sign (positive or negative) is accomplished by a logistic regression model that uses the following features:
 +
#signed degree of a node (i.e., how many positive edges it has)
 +
#"type" of triads formed by (u,v and w) in a way that the node w has an edge to both u and v.  For example, a negative/negative relations exists if w foe with both u and v (thus u and v might be friends, as explained below)There are 16 types of such triads.
  
=== Continuous Time ===
 
The previous model neglect the variation over time of the influence users have on each others.  For this reason, Goyal et al, present a continuous time model where the probability of a user influencing a neighbor depends on a time t variable (e.g., similar to an exponential decay model).
 
  
=== Discrete Time ===
+
== Experiments and Evaluation ==
The above continuous model isn't truly continuous has it has to be simulated using increment on the variable t.  This process requires long run time for testing.  For this reason, the authors present a discrete time model which assumes that the probability of a user influencing his neighbors his constant over a certain window of time after an action.
 
  
== Experiments and Evaluation ==
+
The model is compared to a random approach, where positive and negative are randomly selected. For this approach to be valid, the authors sample from the SlashDot dataset to obtain 50% of positive and negative.  The effects of the different features on the logistic regression are shown in the following results graph:
  
The different models are evaluated with a ROC curve with true positive rate and false positive rate varying for different threshold values. The threshold value controls how much influence a user need to have from his neighbors in order to do the action (e.g., join a group).  The result show that the static model fail to perform as well as the two temporally aware models, which probably indicates that influence does vary over time in social networks.  Also, although discrete and continuous time models show similar performance, the discrete model has a faster run time.
+
[[File:Results_predicting_negative.JPG]] Source: the original paper
  
[[File:Result_learning_influence_probabilities.jpg]] Source: the original paper
+
In this graph, A corresponds to predicting Epinions "distrust" relationships, B corresponds to SlashDot "foes" relationship and C corresponds to a user voting down another user for adminship in Wikipedia. The results indicate that it is possible to predicting the polarity of a link with high accuracy compared to a random baseline.  This results seems to generalize to multiple dataset.

Revision as of 18:52, 25 March 2011

Online version

An online version of this paper is available at the [ACM digital library].

Summary

This focuses on UsesMethod:unsupervised learning for Determining Social Network Attributes, more specifically, link strength in social network. Previous work focusing on friendship relations mostly assumes binary relation (connected or not connected). However, the authors argue that real-life network is a more complicated environment, where acquaintances and best-friends relations are mixed together. They develop an unsupervised model to estimate the strenght of these relations by using features such as bi-directional communication as well as user similarity. Their approach is evaluated on Facebook, and shows an improved classification accuracy.

Key Contributions

Ways of predicting negative relations in social networks.

Background

The authors refer to two main theories that have been proposed in psychology to explain the possible causes of negative links: balance and status. Balance is a concept that includes transitional relationships (e.g., "the friend of my friend is my friend"), as well as other variants (e.g., "the friend of my enemy is my enemy"). This balance might explain how negative links are created in social networks (e.g., especially if relationships are public). The concept of status is that every human conceives other humans as having either a higher or lower status. Such idea can be represented with (u,v) links, where the existence of such links means that u considers v of higher status. Once again, these status relations might influence the characterization of "friend" or "enemy" (e.g., one prefers to be friend with another of higher status).

Models

Logistic Regression

The task of predicting edge sign (positive or negative) is accomplished by a logistic regression model that uses the following features:

  1. signed degree of a node (i.e., how many positive edges it has)
  2. "type" of triads formed by (u,v and w) in a way that the node w has an edge to both u and v. For example, a negative/negative relations exists if w foe with both u and v (thus u and v might be friends, as explained below). There are 16 types of such triads.


Experiments and Evaluation

The model is compared to a random approach, where positive and negative are randomly selected. For this approach to be valid, the authors sample from the SlashDot dataset to obtain 50% of positive and negative. The effects of the different features on the logistic regression are shown in the following results graph:

Results predicting negative.JPG Source: the original paper

In this graph, A corresponds to predicting Epinions "distrust" relationships, B corresponds to SlashDot "foes" relationship and C corresponds to a user voting down another user for adminship in Wikipedia. The results indicate that it is possible to predicting the polarity of a link with high accuracy compared to a random baseline. This results seems to generalize to multiple dataset.