Modeling Relational Events via Latent Classes

This is a paper discussed in Social Media Analysis 10-802 in Spring 2011.

Citation

Christopher DuBois, Padhraic Smyth. Modeling Relational Events via Latent Classes. KDD 2010

Online version

Summary

Many social network activities can be described as a series of dyadic events. An event in this paper is defined as a triple (sender, receiver, event_type). The authors assume that each event is generated by a latent class, and they propose a graphical model that identifies these latent classes from the observed dyadic events. Inference is implemented with both Gibbs sampling and Expectation-Maximization.

Methodology

Relational events are assumed to be generated by the following process:

1. Draw the class distribution ${\bar {\pi }}$ ~ Dirichlet( ${\bar {a}}$ )

2. For each class c in {1...C}, draw the distributions:

$\theta _{c}$ ~ Dirichlet( $\beta$ ), over senders

$\phi _{c}$ ~ Dirichlet( $\gamma$ ), over receivers

$\psi _{c}$ ~ Dirichlet( $\delta$ ), over event types

3. For each event:

(a) Draw $c$ ~ Multinomial( ${\bar {\pi }}$ ), the event’s class

(b) Draw $s|c$ ~ Multinomial( $\theta _{c}$ ), the event’s sender

(c) Draw $r|c$ ~ Multinomial( $\phi _{c}$ ), the event’s receiver

(d) Draw $a|c$ ~ Multinomial( $\psi _{c}$ ), the event’s type
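The generative process above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names, the symmetric Dirichlet priors, and the default hyperparameter values are all assumptions made for the example.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw one index according to a probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_events(n_events, n_classes, n_senders, n_receivers, n_types,
                    alpha=1.0, beta=1.0, gamma=1.0, delta=1.0, seed=0):
    """Sample (sender, receiver, type) triples from the latent class process.

    Hyperparameter names and defaults are assumptions for illustration.
    """
    rng = random.Random(seed)
    # Class distribution pi.
    pi = sample_dirichlet(alpha, n_classes, rng)
    # Per-class distributions over senders, receivers, and event types.
    theta = [sample_dirichlet(beta, n_senders, rng) for _ in range(n_classes)]
    phi = [sample_dirichlet(gamma, n_receivers, rng) for _ in range(n_classes)]
    psi = [sample_dirichlet(delta, n_types, rng) for _ in range(n_classes)]
    events = []
    for _ in range(n_events):
        c = sample_categorical(pi, rng)        # (a) the event's class
        s = sample_categorical(theta[c], rng)  # (b) the event's sender
        r = sample_categorical(phi[c], rng)    # (c) the event's receiver
        a = sample_categorical(psi[c], rng)    # (d) the event's type
        events.append((s, r, a))
    return events, (pi, theta, phi, psi)
```

Calling `generate_events(100, 3, 5, 5, 4)` would return 100 integer-coded triples together with the parameters that produced them, which is convenient for checking an inference routine against known ground truth.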

It's not hard to work out the likelihood for the data:
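As a sketch derived only from the generative process above (the subscript notation $\pi_c$, $\theta_{c,s}$, $\phi_{c,r}$, $\psi_{c,a}$ and the event index $n$ are assumptions, not taken from the paper), marginalizing over the latent class of each event and treating the $N$ events as independent given the parameters would give:

```latex
P(s, r, a \mid \pi, \theta, \phi, \psi)
  = \sum_{c=1}^{C} \pi_{c}\, \theta_{c,s}\, \phi_{c,r}\, \psi_{c,a}

P(\text{data} \mid \pi, \theta, \phi, \psi)
  = \prod_{n=1}^{N} \sum_{c=1}^{C}
    \pi_{c}\, \theta_{c,s_{n}}\, \phi_{c,r_{n}}\, \psi_{c,a_{n}}
```

Here $\theta_{c,s}$ denotes the probability of sender $s$ under class $c$, and similarly for receivers and event types.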

Two inference methods, Gibbs sampling and EM, are implemented in the paper.
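To make the EM option concrete, here is a minimal sketch of EM for a latent class mixture of this form. It is not the authors' implementation: the function name `em_fit`, the random initialization, and the additive `smoothing` pseudo-counts (standing in for the Dirichlet priors) are assumptions for illustration. The collapsed Gibbs sampler is not shown.

```python
import random

def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def em_fit(events, n_classes, n_senders, n_receivers, n_types,
           n_iters=50, smoothing=0.1, seed=0):
    """Fit class, sender, receiver, and type distributions with EM.

    `smoothing` adds pseudo-counts in the M-step (an assumption that
    stands in for the Dirichlet priors and avoids zero probabilities).
    """
    rng = random.Random(seed)

    def random_dist(size):
        return normalize([rng.random() + 0.1 for _ in range(size)])

    pi = random_dist(n_classes)
    theta = [random_dist(n_senders) for _ in range(n_classes)]
    phi = [random_dist(n_receivers) for _ in range(n_classes)]
    psi = [random_dist(n_types) for _ in range(n_classes)]

    for _ in range(n_iters):
        # Expected counts start at the smoothing pseudo-count.
        pi_n = [smoothing] * n_classes
        theta_n = [[smoothing] * n_senders for _ in range(n_classes)]
        phi_n = [[smoothing] * n_receivers for _ in range(n_classes)]
        psi_n = [[smoothing] * n_types for _ in range(n_classes)]
        for s, r, a in events:
            # E-step: posterior responsibility of each class for this event.
            resp = normalize([pi[c] * theta[c][s] * phi[c][r] * psi[c][a]
                              for c in range(n_classes)])
            for c, w in enumerate(resp):
                pi_n[c] += w
                theta_n[c][s] += w
                phi_n[c][r] += w
                psi_n[c][a] += w
        # M-step: renormalize the expected counts.
        pi = normalize(pi_n)
        theta = [normalize(row) for row in theta_n]
        phi = [normalize(row) for row in phi_n]
        psi = [normalize(row) for row in psi_n]
    return pi, theta, phi, psi
```

Events are expected as integer-coded `(sender, receiver, type)` triples. In practice one would monitor the log-likelihood for convergence rather than run a fixed number of iterations; the fixed count is used here only to keep the sketch short.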

Predicting

We can make predictions from the parameters inferred:
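One plausible reading, sketched below, is that the predictive probability of an event is obtained by summing over classes using point estimates of the parameters. The function name and the hand-set parameter values are hypothetical, and the paper may instead average over posterior samples.

```python
def event_probability(event, pi, theta, phi, psi):
    """Probability of a (sender, receiver, type) triple, summed over classes."""
    s, r, a = event
    return sum(pi[c] * theta[c][s] * phi[c][r] * psi[c][a]
               for c in range(len(pi)))

# Hand-set toy parameters (hypothetical): 2 classes, 2 senders,
# 2 receivers, 2 event types.
pi = [0.5, 0.5]
theta = [[0.9, 0.1], [0.2, 0.8]]
phi = [[0.3, 0.7], [0.6, 0.4]]
psi = [[1.0, 0.0], [0.5, 0.5]]

# 0.5*0.9*0.7*1.0 + 0.5*0.2*0.4*0.5 = 0.315 + 0.02 = 0.335
prob = event_probability((0, 1, 0), pi, theta, phi, psi)
```

Because each factor is a proper distribution, the probabilities over all possible triples sum to one, so the same function can be used to rank candidate senders, receivers, or event types.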

Data

The data set consists of international events involving entities from 450 countries over the 2000-2005 time period. These data have been used by political scientists to explore international relations and policy. The authors used an automated system for coding 3,575,897 events from Reuters news reports. Each event takes the form [entity A] [action] [entity B]. Actions in this data set fall into 247 possible types, such as judicial action, military action, and so forth.

A quick example of what the output might look like:

Experimental Result

Uniform baseline: predict that all events are equally likely
Multinomial baseline: predict each event by its observed frequency
Graphical model: predict from the latent class model described above
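The two baselines are simple enough to sketch directly. The function names, the additive smoothing in the multinomial baseline (added so that unseen events do not get zero probability), and the use of average log-probability as the score are all assumptions for illustration; the summary above does not state these details.

```python
import math
from collections import Counter

def uniform_log_prob(n_senders, n_receivers, n_types):
    """Log-probability of any event when all triples are equally likely."""
    return -math.log(n_senders * n_receivers * n_types)

def fit_multinomial_baseline(train_events, n_senders, n_receivers, n_types,
                             smoothing=1.0):
    """Return a log-probability function based on observed triple frequencies.

    `smoothing` is an assumed additive pseudo-count per possible triple.
    """
    counts = Counter(train_events)
    n_outcomes = n_senders * n_receivers * n_types
    denom = len(train_events) + smoothing * n_outcomes

    def log_prob(event):
        return math.log((counts[event] + smoothing) / denom)

    return log_prob

def mean_log_prob(log_prob, test_events):
    """Average log-probability assigned to held-out events."""
    return sum(log_prob(e) for e in test_events) / len(test_events)
```

A model that captures real structure in the data should score higher on held-out events than both of these baselines.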

MPMM is the model the authors designed. MCGS is an algorithm used in the collapsed Gibbs sampling process (details omitted here). C is the number of classes. In short, the graphical model performs much better than the baselines.
