Difference between revisions of "Xufei Wang, ICDM, 2010"

Revision as of 23:39, 27 March 2011

Citation

Xufei Wang. 2010. Discovering Overlapping Groups in Social Media, the 10th IEEE International Conference on Data Mining (ICDM 2010).

Online Version

http://dmml.asu.edu/users/xufei/Papers/ICDM2010.pdf

Databases

BlogCatalog [1]

Delicious [2]

Summary

In this paper, the authors propose a novel co-clustering framework that uses the subscription links between users and tags in social media to discover overlapping communities. The basic ideas are:

  • To discover overlapping communities in social media. The diverse interests and interactions that people have in online social life suggest that one person often belongs to more than one community.
  • To use user-tag subscription information instead of user-user links. Metadata such as tags becomes an important source for measuring user-user similarity. The paper shows that more accurate community structures can be obtained by scrutinizing tag information.
  • To obtain clusters containing users and tags simultaneously. Existing co-clustering methods cluster users and tags separately, so it is not clear which user cluster corresponds to which tag cluster. The proposed method finds the user and tag group structure together with their correspondence.

Problem Statement

In this paper, the concept of community is generalized to include both users and tags. Tags of a community imply the major concern of people within it.

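Let <math>\mu = \left ( \mu _{1},\mu _{2},...,\mu _{m} \right )</math> denote the user set and <math>\tau = \left ( \tau _{1},\tau _{2},...,\tau _{n} \right )</math> the tag set. A community <math>C_{i}\left ( 1\leq i\leq k \right )</math> is a subset of users and tags, where k is the number of communities. As mentioned above, communities usually overlap, i.e., <math>C_{i}\cap C_{j}\neq \emptyset</math> can hold for <math>i\neq j \left ( 1\leq i,j\leq k \right )</math>. On the other hand, users and their subscribed tags form a user-tag matrix M, in which each entry <math>M_{ij}\in \left \{ 0,1 \right \}</math> indicates whether user <math>\mu _{i}</math> subscribes to tag <math>\tau _{j}</math>. So it is reasonable to view a user as a sparse vector of tags, and each tag as a sparse vector of users.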
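
As a rough illustration of this notation (not code from the paper), the sketch below builds a small binary user-tag matrix M and two overlapping communities that mix users and tags. The user names, tag names, subscriptions, and community memberships are all made-up toy data, and the helper functions user_vector and tag_vector are hypothetical names chosen for this example.

```python
# Toy data (assumed for illustration only; not taken from the paper).
users = ["u1", "u2", "u3"]
tags = ["python", "travel", "food", "music"]
subscriptions = {
    "u1": {"python", "music"},
    "u2": {"travel", "food"},
    "u3": {"python", "travel"},
}

# Binary user-tag matrix M: M[i][j] == 1 iff user i subscribes to tag j.
M = [[1 if t in subscriptions[u] else 0 for t in tags] for u in users]


def user_vector(i):
    """Row i of M: user i viewed as a vector over tags."""
    return M[i]


def tag_vector(j):
    """Column j of M: tag j viewed as a vector over users."""
    return [row[j] for row in M]


# Two hypothetical communities, each a set containing both users and tags.
C1 = {"u1", "u3", "python", "music"}
C2 = {"u2", "u3", "travel", "food"}

# Communities may overlap: here user u3 belongs to both.
overlap = C1 & C2

print(M)
print(user_vector(2), tag_vector(0))
print(overlap)
```

In realistic data most entries of M are zero, so a sparse representation (for example, the per-user tag sets in subscriptions) would usually be preferred over a dense list of lists.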

Given the notation above, the overlapping co-clustering problem can be stated formally as follows:

Input:

Brief description of the method

Experimental Result

Related papers