Difference between revisions of "Mrinmaya et. al. WWW'12"

Latest revision as of 13:14, 2 October 2012

This is a scientific paper by Mrinmaya Sachan et al. that appeared in WWW'12. Below is the paper summary written by Tuan Anh.

Citation

@inproceedings{Sachan:2012:UCI:2187836.2187882,
author = {Sachan, Mrinmaya and Contractor, Danish and Faruquie, Tanveer A. and Subramaniam, L. Venkata},
title = {Using content and interactions for discovering communities in social networks},
booktitle = {Proceedings of the 21st international conference on World Wide Web},
series = {WWW '12},
year = {2012},
isbn = {978-1-4503-1229-5},
location = {Lyon, France},
pages = {331--340},
numpages = {10},
url = {http://doi.acm.org/10.1145/2187836.2187882},
doi = {10.1145/2187836.2187882},
acmid = {2187882},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {community detection, probabilistic methods, social networks},
}

Online Version

Using Content and Interactions for Discovering Communities in Social Networks.

Summary

In this paper, the authors study the problem of community detection in social networks. They take a probabilistic approach and propose a generative model that describes how the messages users exchange, and the interactions between them, are generated from each user's hidden community membership. The general model, or "full model" as the authors call it, has the following generative process.

For each of the topics, $1\leq z\leq Z$, sample topic $z$ as a $V$-dimensional multinomial distribution over words, ${\vec {\lambda _{z}}}\sim Dir_{V}(\delta )$
For each of the communities, $1\leq c\leq C$, sample the interaction-type distribution of community $c$ as an $X$-dimensional multinomial distribution over types of interaction, ${\vec {\phi _{c}}}\sim Dir_{X}(\beta )$
For each of the communities, $1\leq c\leq C$, sample the recipient distribution of community $c$ as a $U$-dimensional multinomial distribution over the set of users, ${\vec {\xi _{c}}}\sim Dir_{U}(\epsilon )$
For each user $u_{i}$, $i=1,\cdots ,U$:
- Sample a $C$-dimensional multinomial ${\vec {\theta _{u_{i}}}}\sim Dir_{C}(\alpha )$, representing the community proportions for that sender.
- For each community $c$, $1\leq c\leq C$, sample a $Z$-dimensional multinomial ${\vec {\zeta _{u_{i};c}}}\sim Dir_{Z}(\nu )$, representing the topic proportions for that community and sender.
- For each post $p$ $(1\leq p\leq P_{i})$ generated by the sender $u_{i}$ and having $N_{p}$ words:
  - Choose a community assignment $c_{p}\sim Mult({\vec {\theta _{u_{i}}}})$, $c_{p}\in [1:C]$, for the post.
  - For each recipient slot $i$, $1\leq i\leq R_{p}$, of the post $p$: choose a recipient $r_{p_{i}}\sim Mult({\vec {\xi _{c_{p}}}})$, $r_{p_{i}}\in [1:U]$.
  - Choose a social interaction type $X_{p}\sim Mult({\vec {\phi _{c_{p}}}})$, $X_{p}\in [1:X]$, for the post.
  - For each word slot $j$, $1\leq j\leq N_{p}$, in $p$:
    - Choose a topic assignment $z_{w_{j}}\sim Mult({\vec {\zeta _{u_{i};c_{p}}}})$, $z_{w_{j}}\in [1:Z]$.
    - Choose a word $w_{j}\sim Mult({\vec {\lambda _{z_{w_{j}}}}})$.
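
The generative process above can be sketched as a forward sampler. This is a minimal illustration, not the authors' code: the function name `generate_posts`, the fixed number of posts, words and recipients per post, and all default hyper-parameter values are assumptions chosen to keep the example small.

```python
import numpy as np

def generate_posts(U=5, C=2, Z=3, V=20, X=3,
                   posts_per_user=2, words_per_post=6, recipients_per_post=2,
                   alpha=0.5, beta=0.5, delta=0.5, epsilon=0.5, nu=0.5, seed=0):
    """Sample synthetic posts by following the generative process step by step.

    Fixed post, word and recipient counts are simplifying assumptions;
    the model itself lets P_i, N_p and R_p vary per user and per post.
    """
    rng = np.random.default_rng(seed)

    # Global distributions, one row per topic or community.
    lam = rng.dirichlet(np.full(V, delta), size=Z)   # topic -> words, shape (Z, V)
    phi = rng.dirichlet(np.full(X, beta), size=C)    # community -> interaction types, (C, X)
    xi = rng.dirichlet(np.full(U, epsilon), size=C)  # community -> recipients, (C, U)

    posts = []
    for u in range(U):
        theta_u = rng.dirichlet(np.full(C, alpha))      # community proportions of sender u
        zeta_u = rng.dirichlet(np.full(Z, nu), size=C)  # topic proportions per community, (C, Z)
        for _ in range(posts_per_user):
            c = int(rng.choice(C, p=theta_u))           # community of the post
            recipients = rng.choice(U, size=recipients_per_post, p=xi[c])
            x = int(rng.choice(X, p=phi[c]))            # interaction type
            topics = rng.choice(Z, size=words_per_post, p=zeta_u[c])
            words = [int(rng.choice(V, p=lam[z])) for z in topics]
            posts.append({
                "sender": u,
                "community": c,
                "recipients": recipients.tolist(),
                "interaction_type": x,
                "topics": topics.tolist(),
                "words": words,
            })
    return posts

if __name__ == "__main__":
    for post in generate_posts()[:3]:
        print(post)
```

Note that the topic draw uses `zeta_u[c]`, which depends on both the sender and the post's community, while recipients and interaction type use only the community-level `xi[c]` and `phi[c]`.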

In the model presented above, $Z$ and $C$ are the number of topics and the number of communities, respectively. They are user-defined parameters and must be given beforehand. $\nu ,\alpha ,\beta ,\delta ,\epsilon$ are hyper-parameters and must also be given beforehand. The other parameters are estimated using Gibbs sampling.
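
This summary does not reproduce the paper's sampling equations. As a hedged illustration of the final step only: in collapsed Gibbs samplers for LDA-style models, multinomial parameters are commonly read off the assignment counts as the posterior mean under a symmetric Dirichlet prior. The sketch below assumes that standard estimator; the helper name `dirichlet_posterior_mean` and the example counts are hypothetical.

```python
import numpy as np

def dirichlet_posterior_mean(counts, prior):
    """Posterior mean of multinomial parameters under a symmetric Dirichlet prior.

    counts: array whose last axis indexes outcomes (e.g. posts per community
            for each user); prior: the symmetric Dirichlet hyper-parameter.
    Each row of the result is (n_k + prior) / (sum_k n_k + K * prior).
    """
    counts = np.asarray(counts, dtype=float)
    k = counts.shape[-1]
    return (counts + prior) / (counts.sum(axis=-1, keepdims=True) + k * prior)

# Hypothetical counts: user 0 has 3 posts assigned to community 0 and 1 to
# community 1; user 1 has no posts yet, so the estimate falls back to the prior.
post_counts = [[3, 1], [0, 0]]
theta_hat = dirichlet_posterior_mean(post_counts, prior=0.5)
print(theta_hat)
```

The same helper would apply to the other count tables (words per topic, recipients per community, and so on) with the matching hyper-parameter.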

Discussion

This is yet another paper on network clustering based on topic modeling. The underlying assumption is that when a user writes to other users, the topic is decided jointly by the sender and the sender's community, whereas the recipients and the type of message (reply, forward, etc.) are decided by the sender's community only. As a result, the mapping from user-topic similarity to user-community similarity is not straightforward.
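
The dependency structure described above can be written as a per-post factorization, conditional on the model parameters. The double-subscript notation (for example, $\theta _{u_{i},c}$ for the $c$-th component of ${\vec {\theta _{u_{i}}}}$) and the recipient index $k$ are introduced here for illustration only:

$$p(c_{p},X_{p},r_{p_{1:R_{p}}},z_{w_{1:N_{p}}},w_{1:N_{p}}\mid u_{i})=\theta _{u_{i},c_{p}}\;\phi _{c_{p},X_{p}}\;\prod _{k=1}^{R_{p}}\xi _{c_{p},r_{p_{k}}}\;\prod _{j=1}^{N_{p}}\zeta _{u_{i};c_{p},z_{w_{j}}}\;\lambda _{z_{w_{j}},w_{j}}$$

Only the $\theta$ and $\zeta$ factors carry the sender index $u_{i}$; the $\phi$ and $\xi$ factors depend on the post's community $c_{p}$ alone.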

Related papers

Original work on LDA: Blei et al. Latent Dirichlet Allocation (http://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf). Journal of Machine Learning Research 3 (2003) 993-1022
Gregor Heinrich's note on Parameter estimation for text analysis (http://www.arbylon.net/publications/text-est.pdf), which clearly shows how to use Gibbs sampling for inference in LDA-based models
Probabilistic graph clustering: Airoldi et al. Mixed Membership Stochastic Blockmodels (http://jmlr.csail.mit.edu/papers/volume9/airoldi08a/airoldi08a.pdf). Journal of Machine Learning Research 9 (2008) 1981-2014
Similar work by McCallum et al. on modeling user - recipient - topic (http://people.cs.umass.edu/~mccallum/papers/art04tr.pdf)

