Project dong, 10-802 spring 2010
This page is a project report.
- Title: Analyzing perspectives in an interactive setting
- Author: Dong Nguyen
Summary
This project analyzed how perspectives are expressed in text. We used political discussion data and compared 'left' and 'right' perspectives. First, we ran experiments to compare different methods for estimating the bias in text. We then used one of these methods to analyze how interaction influences the perspectives expressed in an online political forum.
Perspectives in text
The perspective of a speaker or author influences the text or speech they produce. A well-known example is the choice between 'freedom fighter' and 'terrorist'. Estimating the perspective from which a text is written is a hard problem: texts written from different perspectives are often about the same topic, so the differences between them are often very subtle.
Potential applications
- Estimating voting behavior of political persons
- Track political opinion
- Diversify search results (return documents written in different perspectives about topics of interest)
- Personalize search results (return documents in viewpoint of user)
- Etc.
Related work
There has been a variety of work on perspectives in text.
The following is an overview of approaches that use machine learning
- Word scores method (Extracting policy positions from political texts using words as data by Laver et al., 2003)
- Machine learning algorithms such as Naive Bayes & SVM
- Topic modeling approaches (Cross-cultural analysis of blogs and forums with mixed-collection topic models by Paul and Girju, 2009; Joint topic and perspective model for ideological discourse by Lin et al., 2008)
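As an illustration of the first technique in the list above, the core of the word scores method can be sketched as follows. This is a minimal sketch, not the code used in this project: the function names (`relative_freqs`, `word_scores`, `score_text`), the whitespace tokenization, and the toy reference texts with positions -1 and +1 are all assumptions made for the example. Each word gets a score equal to the average position of the reference texts, weighted by how likely the word is to come from each of them; a new text is then scored by the frequency-weighted mean of its word scores. The rescaling step that Laver et al. apply to the raw text scores is left out.

```python
from collections import Counter


def relative_freqs(tokens):
    """Relative frequency of each word in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts):
    """Compute a score per word from (tokens, position) reference pairs."""
    freqs = [(relative_freqs(tokens), pos) for tokens, pos in reference_texts]
    vocab = set().union(*(f.keys() for f, _ in freqs))
    scores = {}
    for w in vocab:
        # Total relative frequency of w across all reference texts.
        total = sum(f.get(w, 0.0) for f, _ in freqs)
        # P(reference text | word) times that text's known position.
        scores[w] = sum(f.get(w, 0.0) / total * pos for f, pos in freqs)
    return scores


def score_text(tokens, scores):
    """Score an unlabeled text; words without a score are ignored."""
    scored = [w for w in tokens if w in scores]
    if not scored:
        return None  # no overlap with the reference vocabulary
    freqs = relative_freqs(scored)
    return sum(f * scores[w] for w, f in freqs.items())


# Toy reference texts (hypothetical): position -1 = 'left', +1 = 'right'.
references = [
    ("tax the rich fund public schools".split(), -1.0),
    ("cut tax protect free markets".split(), 1.0),
]
scores = word_scores(references)
estimate = score_text("protect public schools".split(), scores)
```

Words that occur in only one reference text receive that text's position as their score, while a shared word such as 'tax' ends up near the middle, so shared topical vocabulary contributes little to the estimate.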
The following have looked at interaction patterns (such as quoting behavior)
- Mining newsgroups using networks arising from social behavior, Agrawal et al., 2003
- Graph-based user classification for informal online political discourse, Malouf and Mullen, 2007
The following are some linguistic papers on this topic
- Discourse semantics and ideology, van Dijk, 1995
- Ideology and discourse: a multidisciplinary introduction, van Dijk, 2003
- Intertextual borrowings in ideologically competing discourses: The case of the Middle East, Kawakib Momani, 2010
Datasets
We experimented with the following datasets
Another interesting dataset, which was not used in this project but is closely related, is the Bitterlemons dataset.