Difference between revisions of "Project Ideas - Derry, Reyyan"

From Cohen Courses

Revision as of 21:54, 31 January 2011

Social Media Analysis Project Ideas

Team Members

Derry Wijaya [dwijaya@cs.cmu.edu]

Reyyan Yeniterzi [reyyan@cs.cmu.edu]

Project Ideas and Questions We'd Like to Answer

We have several possible ideas for the project:

• We propose to map events to opinions. An event can be social or political in nature and can bring about a change in opinion, or vice versa.

• We propose to analyze opinions from the perspective of associative sorting and social contagion, for example to answer the question of when an opinion gets pushed aside, i.e. which opinions are central and which are peripheral in the opinion graph.

• We propose to construct a social graph, but with words as nodes instead of people. Using this social graph of words, we propose to analyze: (1) how co-occurrence with other words (associativity with other words) can influence the meaning of a word (for example, the word 'BP' frequently co-occurred with negative words during and after the Gulf spill event), (2) how new words emerge in the graph (like 'Google'), or take on a new part of speech (like 'googling'), (3) how the meaning and usage of words like 'LOL' change over time, from meaning 'laughing out loud' to 'whatever'.

• We propose to automatically create a social graph of opinions from tweets, where nodes are people, links are follower/following relations, and colors are attributes (positive or negative towards the entity we are interested in, such as 'Toyota' or 'Ford').
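As a rough illustration of the word-graph idea above, the following minimal sketch builds a co-occurrence graph in which words are nodes and an edge weight counts the documents where two words appear together. The function names, the whitespace tokenization, and the toy documents are assumptions made only for this example; they are not taken from any of the proposed datasets.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(documents):
    """Map each unordered word pair to the number of documents containing both."""
    edges = Counter()
    for text in documents:
        # Crude tokenization (an assumption): lowercase, split on whitespace,
        # strip common punctuation.
        words = {w.strip(".,!?\"'").lower() for w in text.split()}
        words.discard("")
        for a, b in combinations(sorted(words), 2):
            edges[(a, b)] += 1
    return edges


def neighbors(edges, word):
    """Return the words co-occurring with `word`, most frequent first."""
    counts = Counter()
    for (a, b), n in edges.items():
        if a == word:
            counts[b] += n
        elif b == word:
            counts[a] += n
    return counts.most_common()


# Hypothetical toy documents, for illustration only.
docs = [
    "BP spill disaster",
    "BP disaster response",
    "BP profits rise",
]
edges = cooccurrence_graph(docs)
top_neighbor = neighbors(edges, "bp")[0]
```

Building one such graph per time slice and comparing a word's neighbors across slices would be one way to look at how its associations shift, though that step is not shown here.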

Dataset

For each of the ideas above, we propose to use (in order of the ideas):

• The TREC Blog Track dataset: the Blog08 corpus and the TRC2 (news) corpus, which cover the same time span.

• The Yano & Smith blog dataset, the politics.com dataset, or the U.S. Floor debates dataset

• News data such as TRC2 (news) corpus

• Twitter data (perhaps GeoText data)

Motivation

For each of the ideas above, our motivations are (in order of the ideas):

• How can an event reported in a news article change a blogger's opinion on the related topic?

• How often do bloggers start writing about a topic for the first time after reading about a related event in the news?

• It will be interesting to find out whether centrality and betweenness apply to a graph of opinions. A graph can be constructed where each node is a piece of opinion and the edges are similarities between the opinions. Can we then find which opinions in the graph are the ringleaders? Are there neutral or indecisive opinions that act as go-betweens for different groups of opinions? How cohesive are the groups of opinions? How does the graph change over time? Is there spatial segregation in the graph, where minority opinions are pushed to the periphery?
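A minimal sketch of the opinion graph described above might look like the following. It assumes Jaccard word overlap as the similarity measure and a cutoff of 0.3 for drawing an edge; both are illustrative choices, not ones fixed by the proposal. Betweenness is computed with Brandes' algorithm for unweighted graphs, and the three opinions are made-up toy data.

```python
from collections import deque


def jaccard(a, b):
    """Word-overlap similarity between two opinion texts (an assumed measure)."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def build_graph(opinions, threshold=0.3):
    """Link opinions i and j when their similarity reaches the assumed threshold."""
    graph = {i: set() for i in range(len(opinions))}
    for i in range(len(opinions)):
        for j in range(i + 1, len(opinions)):
            if jaccard(opinions[i], opinions[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def betweenness(graph):
    """Unnormalized betweenness (Brandes) for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected path was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical toy opinions, for illustration only.
opinions = [
    "lower taxes now",
    "lower taxes and spending",
    "cut spending and waste",
]
graph = build_graph(opinions)
degree = {v: len(nbrs) for v, nbrs in graph.items()}
between = betweenness(graph)
```

In this toy graph the middle opinion shares words with both of the others, so it has the highest degree and the only nonzero betweenness, which is the kind of go-between role the questions above ask about.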

Techniques

For each of the ideas above, possible techniques we can use are (in order of the ideas):

Evaluation

For each of the ideas above, possible evaluation techniques are (in order of the ideas):

Superpowers