Comparison: A Latent Variable Model for Geographic Lexical Variation and A probabilistic approach to spatiotemporal theme pattern mining on weblogs

Papers

Problem

Jacob et al. aim to analyze how word usage in vernacular text varies with geography. In particular, they model lexical variation by both topic and geography, and they partition regions into coherent linguistic communities. Their model can also predict, with some accuracy, the location of an author from raw text.

Q. Mei et al. aim to analyze weblogs by mining their spatiotemporal patterns. In particular, they address the limitation that earlier approaches to finding subtopics in weblogs considered either spatial information or temporal information, but not both.

Method

Jacob et al. use an extended version of LDA that takes location information into account when modeling word distributions and that assigns a probabilistic model to both location and document.

Q. Mei et al. base their model on pLSI and give no probabilistic model for the document.
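The difference in how the two model families treat a document can be illustrated with a minimal sketch. It is not the model from either paper: it leaves out location and time entirely and only contrasts a generic LDA-style document (topic mixture drawn from a Dirichlet prior) with a generic pLSI-style document (topic mixture stored as a fixed parameter per training document). All names, the toy vocabulary, and the probabilities below are hypothetical and chosen only for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Return an index drawn according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_words(theta, topic_word, vocab, n_words, rng):
    """Shared word-level process: pick a topic from theta, then a word from that topic."""
    words = []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)
        w = sample_categorical(topic_word[z], rng)
        words.append(vocab[w])
    return words

def lda_document(alpha, topic_word, vocab, n_words, rng):
    """LDA-style: the topic mixture of a document is itself drawn from a Dirichlet prior."""
    theta = sample_dirichlet(alpha, rng)
    return theta, generate_words(theta, topic_word, vocab, n_words, rng)

def plsi_document(doc_id, doc_topic, topic_word, vocab, n_words, rng):
    """pLSI-style: the topic mixture is a fixed parameter looked up by document index."""
    theta = doc_topic[doc_id]  # raises IndexError for a document not seen in training
    return theta, generate_words(theta, topic_word, vocab, n_words, rng)

# Hypothetical toy parameters (two topics, four words, two training documents).
VOCAB = ["storm", "flood", "price", "battery"]
TOPIC_WORD = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
DOC_TOPIC = [[0.9, 0.1], [0.2, 0.8]]

if __name__ == "__main__":
    rng = random.Random(0)
    print(lda_document([0.5, 0.5], TOPIC_WORD, VOCAB, 5, rng))
    print(plsi_document(0, DOC_TOPIC, TOPIC_WORD, VOCAB, 5, rng))
```

In this sketch, `lda_document` can produce a mixture for any new document because the mixture comes from a prior, whereas `plsi_document` can only look up mixtures for documents that already have an entry in `DOC_TOPIC`.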

Dataset Used

Jacob et al. use a geotagged Twitter dataset, while Q. Mei et al. use weblog datasets on Hurricane Katrina, Hurricane Rita, and iPod Nano.

Big Idea

These two papers differ in all three aspects above, i.e., the problem addressed, the method, and the dataset used.

Problem: Jacob et al. try to find topics related to a specific user by incorporating the user's location information, while Q. Mei et al. aim to find subtopics across different times and locations in documents that share the same topic.

Method: Jacob et al. use an LDA-type model, while Q. Mei et al. adopt a pLSI-type method.

Dataset: Because the two papers address different problems, the datasets they use are very different. Jacob et al. use Twitter-type documents, which are very short. Q. Mei et al. use weblogs, which are relatively long.

Other Discussions

It would be interesting to apply the method used by Jacob et al. to the problem that Q. Mei et al. try to address, since LDA is claimed to improve on pLSI.
