Inferring the Diffusion and Evolution of Topics in Social Communities

From Cohen Courses
Revision as of 21:56, 5 November 2012 by Bliu1

Citation

Cindy Xide Lin, Qiaozhu Mei, Yunliang Jiang, Jiawei Han, and Shanxiang Qi, "Inferring the Diffusion and Evolution of Topics in Social Communities", Proc. of 2011 ACM SIGKDD Workshop on Social Network Mining and Analysis (SNAKDD'11), San Diego, CA, Aug. 2011.

PDF: [[1]]

Abstract from the paper

The prevalence of Web 2.0 techniques has led to the boom of various online communities, where topics are spreading ubiquitously among user-generated documents. Together with this diffusion process is the content evolution of the topics, where novel contents are introduced by documents which adopt the topic. Unlike an explicit user behavior (e.g., buying a DVD), both the diffusion paths and the evolutionary process of a topic are implicit, making them much more challenging to discover. In this paper, we aim to simultaneously track the evolution of any arbitrary topic and reveal the latent diffusion paths of that topic in a social community. A novel and principled probabilistic model is proposed which casts our task as a joint inference problem, taking into consideration textual documents, social influences, and topic evolution in a unified way. Specifically, a mixture model is introduced to model the generation of text according to the diffusion and the evolution of the topic, while the whole diffusion process is regularized with user-level social influences through a Gaussian Markov Random Field. Experiments on both synthetic data and real-world data show that the discovery of topic diffusion and evolution benefits from this joint inference, and the probabilistic model we propose performs significantly better than existing methods.