Difference between revisions of "Project 2nd draft Derry Reyyan"

From Cohen Courses
Line 11: Line 11:
 
'''Understanding change'''
 
'''Understanding change'''
  
Given an entity of interest, we would like to model and analyze its change (in terms of words and phrases that co-occur with it) over time.  
+
Given an entity of interest, we would like to model and analyze its change in terms of words and phrases that co-occur with it over time.  
  
 
We propose to construct a social graph, but instead of people, we put words as nodes and edges are weighted based on number of co-occurrence between the words. Using this social graph of words, we propose to analyze:  
 
We propose to construct a social graph, but instead of people, we put words as nodes and edges are weighted based on number of co-occurrence between the words. Using this social graph of words, we propose to analyze:  
  
How co-occurrence with other words influences the meaning or the sentiment associated with the word. For example, the word 'BP' frequently co-occurred with negatively associated words during and after the Gulf-spill event.
+
(1) how co-occurrence with other words change over time
 +
(2) how the change influences the state (semantic or sentiment) associated with the entity
 +
(3) how the change may correspond to events that occur during the same period of time
 +
 
 +
For example, the entity 'BP' frequently co-occurred with negatively associated words during and after the Gulf-spill event.
  
 
== Dataset ==
 
== Dataset ==
Line 23: Line 27:
 
== Motivation ==
 
== Motivation ==
  
How does the semantic of a word or sentiment associated with it change over time depending on its neighbor (i.e. co-occurring words/phrases)? Does such change relate to a particular event that happens in the same period of time? Can we find a natural sequence of events that define a change of state/semantic/sentiment of a particular entity?
+
How does the semantic or sentiment associated with a given entity change over time depending on its neighbor (i.e. co-occurring words/phrases)?  
 +
 
 +
Does such change relate to a particular event that happens in the same period of time?  
 +
 
 +
Can we find a natural sequence of events that define a change of state (semantic or sentiment) of a particular entity?
  
 
== Techniques ==
 
== Techniques ==
  
 
For each of the ideas above, proposed techniques or related papers are (in order of the ideas):
 
For each of the ideas above, proposed techniques or related papers are (in order of the ideas):
 
• Clustering of opinions. Finding when a group of opinions break into two in time (to detect the time '''''t''''' where a change in opinion occurs, followed by the grow of another group of opinion cluster). Topic modeling of news document to pinpoint the particular event at that time '''''t''''' that may cause the change. Related recent paper: [http://upinion.cse.buffalo.edu/beta/SOMApaper.pdf Identifying Breakpoints in Public Opinion].
 
 
• Using centrality and betweenness measures in social network analysis, but applied to a network of opinions (Related paper: [http://onlinelibrary.wiley.com/doi/10.1002/asi.20614/pdf Betweenness Centrality as an Indicator of the Interdisciplinarity of Scientific Journals]). Random walk on the graph to find ring leaders and clusters of opinions. Schelling segregation to measure spatial segregation (we first need to define what 'space' means in the graph of opinions). A related paper to segregation in graph is [http://www.nejm.org/doi/pdf/10.1056/NEJMsa0706154 The Collective Dynamics of Smoking in a Large Social Network].
 
  
 
• Regression analysis to measure tendency of a word to become negative in meaning over time, when co-occurred with negative words (Related paper: [http://www.nejm.org/doi/pdf/10.1056/NEJMsa066082 The Spread of Obesity in a Large Social Network over 32 Years] - applied to measuring the spread of negativity in a network of words).  
 
• Regression analysis to measure tendency of a word to become negative in meaning over time, when co-occurred with negative words (Related paper: [http://www.nejm.org/doi/pdf/10.1056/NEJMsa066082 The Spread of Obesity in a Large Social Network over 32 Years] - applied to measuring the spread of negativity in a network of words).  
 
• Using Bayes rule to measure probability of two people having a link in Twitter based on their friends links and opinions and spatial-temporal overlap. An interesting relation to a recent paper [http://www.pnas.org/content/early/2010/12/02/1006155107.full.pdf Inferring social ties from geographic coincidences].
 
  
 
== Evaluation ==
 
== Evaluation ==
  
A combination of manual evaluation and cross validation (splitting the data into training and testing and evaluate) may be done.
+
A combination of manual evaluation and cross validation (splitting the data into training and testing and evaluate) may be done.
 
 
== Superpowers ==
 
 
 
• Nothing really at the moment, except for a bag full of ideas and a lot of keenness in pursuing at least one of them well.
 

Revision as of 19:02, 14 February 2011

Social Media Analysis Project Ideas

Team Members

Derry Wijaya [dwijaya@cs.cmu.edu]

Reyyan Yeniterzi [reyyan@cs.cmu.edu]

Project Idea

Understanding change

Given an entity of interest, we would like to model and analyze how the words and phrases that co-occur with it change over time.

We propose to construct a graph analogous to a social graph, with words as nodes instead of people and each edge weighted by the number of times its two words co-occur. Using this graph of words, we propose to analyze:

(1) how the entity's co-occurrence with other words changes over time
(2) how that change influences the state (semantic or sentiment) associated with the entity
(3) how that change may correspond to events that occur during the same period

For example, the entity 'BP' co-occurred frequently with negatively associated words during and after the Gulf oil spill.
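
The word graph described above can be sketched as one weighted co-occurrence graph per year. This is only an illustration under stated assumptions: the record layout (tokens, year, count) is a simplified stand-in for n-gram data, the toy records are made up, and the helper names cooccurrence_graphs and neighbors are hypothetical, not part of the proposal.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_graphs(records):
    """Build one weighted word graph per year.

    records: iterable of (tokens, year, count) tuples (assumed layout).
    Returns {year: Counter({(word_a, word_b): weight})} with word_a < word_b.
    """
    graphs = defaultdict(Counter)
    for tokens, year, count in records:
        # Lowercase and de-duplicate so each unordered pair is counted once per record.
        words = sorted({w.lower() for w in tokens})
        for a, b in combinations(words, 2):
            graphs[year][(a, b)] += count
    return dict(graphs)


def neighbors(graph, entity):
    """Return a Counter mapping each neighbor of `entity` to its edge weight."""
    result = Counter()
    for (a, b), weight in graph.items():
        if a == entity:
            result[b] += weight
        elif b == entity:
            result[a] += weight
    return result


# Made-up toy records, for illustration only.
records = [
    (("BP", "profit", "growth"), 2009, 5),
    (("BP", "oil", "spill"), 2010, 8),
    (("BP", "spill", "disaster"), 2010, 3),
]

graphs = cooccurrence_graphs(records)
for year in sorted(graphs):
    print(year, neighbors(graphs[year], "bp").most_common())
```

Comparing the neighbor lists of an entity across years gives a first, rough view of point (1); the per-year graphs can then feed whatever analysis is chosen for points (2) and (3).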

Dataset

Google Books Ngram Data.

Motivation

How do the semantics or sentiment associated with a given entity change over time depending on its neighbors (i.e., co-occurring words/phrases)?

Does such a change relate to a particular event that happens in the same period?

Can we find a natural sequence of events that defines a change of state (semantic or sentiment) of a particular entity?

Techniques

For each of the ideas above, the proposed techniques and related papers are (in order of the ideas):

• Regression analysis to measure the tendency of a word to become negative in meaning over time when it co-occurs with negative words (Related paper: The Spread of Obesity in a Large Social Network over 32 Years, here applied to measuring the spread of negativity in a network of words).
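
One simple way to make the regression idea concrete is sketched below. It is a deliberately reduced stand-in, not the longitudinal model used in the related paper: it fits a one-variable least-squares line relating the share of an entity's co-occurrence weight that falls on negative words in one period to the change in its sentiment score in the next. The sentiment scores, the negative-word list, and the helper names negative_share and fit_line are all hypothetical.

```python
def negative_share(neighbor_weights, negative_words):
    """Fraction of an entity's co-occurrence weight that falls on negative words."""
    total = sum(neighbor_weights.values())
    if total == 0:
        return 0.0
    negative = sum(w for word, w in neighbor_weights.items() if word in negative_words)
    return negative / total


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical negative-word list and observations, for illustration only.
# Each observation: (neighbor weights at time t, sentiment at t, sentiment at t+1).
NEGATIVE = {"spill", "disaster", "lawsuit"}
observations = [
    ({"profit": 5, "growth": 5}, 0.4, 0.4),
    ({"oil": 8, "spill": 2}, 0.4, 0.2),
    ({"oil": 5, "spill": 5}, 0.2, -0.3),
    ({"spill": 8, "disaster": 2}, -0.3, -1.3),
]

xs = [negative_share(weights, NEGATIVE) for weights, _, _ in observations]
ys = [after - before for _, before, after in observations]
slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
```

Under these assumptions, a clearly negative slope would suggest that heavier co-occurrence with negative words tends to precede a drop in the entity's sentiment. A fuller analysis would likely need additional covariates, such as the entity's own prior state.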

Evaluation

We may combine manual evaluation with cross-validation (splitting the data into training and test sets and evaluating on the held-out portion).
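
As a rough sketch of the splitting step, assuming plain k-fold cross-validation over a list of examples (the proposal does not fix the number of folds or what counts as one example), a helper such as the hypothetical k_fold_splits below yields the training and test portions for each fold.

```python
import random


def k_fold_splits(items, k, seed=0):
    """Yield (train, test) lists for k-fold cross-validation.

    Items are shuffled once with a fixed seed, then dealt into k folds;
    each fold serves as the test set exactly once.
    """
    items = list(items)
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Example: 10 placeholder example ids split into 5 folds.
for train, test in k_fold_splits(range(10), k=5):
    print(len(train), len(test))
```

Because the data here is ordered in time, a chronological split (train on earlier periods, test on later ones) may be a more appropriate variant than shuffling; which one fits depends on the final task definition.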