Difference between revisions of "A Latent Variable Model for Geographic Lexical Variation"
From Cohen Courses
Revision as of 00:39, 27 September 2012
Citation
A Latent Variable Model for Geographic Lexical Variation. Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, and Eric P. Xing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), Cambridge, MA, October 2010.
Online version
Summary
This paper analyzes how word usage in vernacular text varies with geography. In particular, it models lexical variation by both topic and geography, separates regions into coherent linguistic communities, and can predict an author's location from raw text with some accuracy.
Data
This work is based on the Twitter dataset, which is available at http://www.ark.cs.cmu.edu/GeoText/. Only geotagged data is used. The authors also select users based on certain criteria, such as