The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email

From Cohen Courses
Revision as of 23:35, 5 November 2012 by Norii

Citation

McCallum, A., Corrada-Emmanuel, A., and Wang, X. The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email, 2004. Technical Report UM-CS-2004-096.

Online version

The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email

Summary

Consider the problem of modeling a company's email network. Suppose Michael is a boss, Pam is his assistant, and both of them email similar sets of people. If we consider only the structure of the email network, Michael and Pam would be assigned similar roles. Their distinct roles as boss and assistant only become clear when we consider the language content of the emails they send.

This paper builds on this idea by incorporating the language content (topics) of messages into social network analysis, which traditionally considers only network structure. The authors extend the Author-Topic model, in which each author has a distribution over topics (and each topic is a distribution over words). In the Author-Recipient-Topic (ART) model presented in this paper, there is instead a distribution over topics for each author-recipient pair.
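To make the author-recipient conditioning concrete, here is a minimal sketch of how a message could be generated under a model of this kind: for each word, pick one of the message's recipients, draw a topic from the distribution for that (author, recipient) pair, then draw a word from that topic. The function name, the uniform choice of recipient, and the toy parameters `theta` and `phi` are illustrative assumptions, not values or code from the paper.

```python
import random

def generate_message(author, recipients, theta, phi, n_words, rng):
    """Sample n_words (recipient, topic, word) triples for one message.

    theta[(author, recipient)] is a list of topic probabilities.
    phi[topic] is a dict mapping each word to its probability.
    All names here are hypothetical, chosen for illustration.
    """
    triples = []
    for _ in range(n_words):
        # Assumption: the recipient is chosen uniformly from the message's recipients.
        x = rng.choice(recipients)
        topic_probs = theta[(author, x)]
        # Draw a topic index from the author-recipient pair's topic distribution.
        z = rng.choices(range(len(topic_probs)), weights=topic_probs, k=1)[0]
        vocab = list(phi[z])
        # Draw a word from the chosen topic's word distribution.
        w = rng.choices(vocab, weights=[phi[z][v] for v in vocab], k=1)[0]
        triples.append((x, z, w))
    return triples

# Toy parameters (made up): topic 0 is "management", topic 1 is "scheduling".
theta = {
    ("michael", "pam"): [0.2, 0.8],
    ("michael", "jim"): [0.9, 0.1],
}
phi = {
    0: {"budget": 0.5, "sales": 0.5},
    1: {"meeting": 0.6, "calendar": 0.4},
}

message = generate_message("michael", ["pam", "jim"], theta, phi, 5, random.Random(0))
```

Note that the same author draws from different topic distributions depending on the recipient, which is what separates this setup from a per-author topic distribution.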

Marginalizing over recipients gives the topics a person is likely to send; marginalizing over authors gives the topics a person is likely to receive. This person-conditioned topic distribution can then be used to calculate similarity between people.
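The marginalization and similarity computation can be sketched as follows. This is an illustration under stated assumptions: the recipients are averaged with uniform weights, similarity is measured with Jensen-Shannon divergence (lower means more similar), and the numbers in `theta` are invented toy values. None of these choices is confirmed by the summary above.

```python
import math

# Toy author-recipient topic distributions over 3 topics (made-up values).
theta = {
    ("michael", "pam"): [0.7, 0.2, 0.1],
    ("michael", "jim"): [0.6, 0.3, 0.1],
    ("pam", "michael"): [0.1, 0.2, 0.7],
    ("pam", "jim"): [0.2, 0.2, 0.6],
    ("jim", "michael"): [0.5, 0.4, 0.1],
    ("jim", "pam"): [0.6, 0.3, 0.1],
}

def sender_topics(theta, author):
    """Marginalize over recipients to get the topics an author tends to send.

    Assumption: each recipient is weighted equally.
    """
    rows = [dist for (a, _), dist in theta.items() if a == author]
    if not rows:
        raise KeyError(author)
    return [sum(col) / len(rows) for col in zip(*rows)]

def kl_divergence(p, q):
    """KL(p || q), skipping terms where p is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and zero when p equals q."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

michael = sender_topics(theta, "michael")
pam = sender_topics(theta, "pam")
jim = sender_topics(theta, "jim")

# With these toy values, Michael's sending profile is closer to Jim's than to Pam's.
print(js_divergence(michael, jim) < js_divergence(michael, pam))
```

The received-topics distribution would be computed the same way by filtering on the second element of each key instead of the first.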

Results

The authors conduct a qualitative analysis of the Author-Recipient-Topic model on the Enron email corpus and the McCallum email corpus, comparing it against the Author-Topic model and a stochastic block model.

[Figure: Art model.png]


Discussion

Related papers

Study plan