Connections between the Lines: Augmenting Social Networks with Text
This paper is available online [1].
Summary
This paper proposes a topic model for social networks, based on Latent Dirichlet Allocation (LDA). Its main goal is to adapt probabilistic topic models to account for relationships between entities. Entities are collections of discrete data; the paper deals only with words, so an entity is effectively a document. Specifically, the paper describes a generative process that chooses words from mixtures of topics, both for the entities themselves and for the relationships between entities. The focus is on network data in which each relationship links two entities. Unlike standard LDA, a word can be generated from a relationship's distribution over topics, in addition to the usual entity-level mechanism.
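The idea that a word may come from either an entity's topic mixture or a relationship's topic mixture can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the paper's exact specification: the names generate_words, entity_theta, pair_theta, and p_pair are hypothetical, the switch between the two sources is modeled as a simple coin flip, and all probabilities below are made-up toy values.

```python
import random

def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_words(n_words, entity_theta, pair_theta,
                   entity_topics, pair_topics, p_pair, rng):
    """Generate words for a context that mentions a pair of entities.

    Assumption (illustrative only): each word is drawn from the
    relationship's topics with probability p_pair, and otherwise from
    the entity's topics, as in standard LDA.
    Returns a list of (source, topic_index, word_index) tuples.
    """
    words = []
    for _ in range(n_words):
        if rng.random() < p_pair:
            # Word explained by the relationship between the two entities.
            z = sample_index(pair_theta, rng)
            w = sample_index(pair_topics[z], rng)
            words.append(("pair", z, w))
        else:
            # Word explained by the entity itself (the standard LDA path).
            z = sample_index(entity_theta, rng)
            w = sample_index(entity_topics[z], rng)
            words.append(("entity", z, w))
    return words

# Toy setup: a 4-word vocabulary and 2 topics of each kind (all values invented).
vocab = ["king", "battle", "brother", "covenant"]
entity_topics = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
pair_topics = [[0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]]
entity_theta = [0.5, 0.5]  # entity's distribution over entity topics
pair_theta = [0.8, 0.2]    # relationship's distribution over pair topics

rng = random.Random(0)
sample = generate_words(10, entity_theta, pair_theta,
                        entity_topics, pair_topics, p_pair=0.5, rng=rng)
for source, z, w in sample:
    print(source, z, vocab[w])
```

Setting p_pair to 0 in this sketch recovers the ordinary LDA-style path, where every word comes from the entity's own topic mixture.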
Datasets
The authors evaluate the model on three text datasets: the Bible, biological scientific abstracts, and Wikipedia.