Takamura et al. 2005
Citation

Hiroya Takamura, Takashi Inui, and Manabu Okumura. 2005. Extracting semantic orientations of words using spin model. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL '05).

Online version

ACL anthology

Summary

This paper follows many other sentiment-analysis papers in analyzing a graph of words with synonym and antonym links to estimate the net sentiment of each word. Its estimation model, however, is a clear departure from most other work in NLP.

The fundamental idea of the paper is that words occurring near each other (according to search-engine hit counts) are likely to have similar sentiment values. The authors observe that this problem is similar to that of determining the most likely spin state of each electron in a field of electrons.

As they describe it, on a local scale electrons near each other tend to have the same spin. Two neighboring electrons with differing spins cost some amount of energy, so the goal of the optimization problem is to find the state of the electron field with the lowest possible energy. Fortunately, computational physicists have studied this spin model thoroughly: while exhaustive computation requires exponential time, tractable approximations are known.

Brief description of the method

The method asserts that the 'energy' of a system of electrons is given by

<math>
E(x, W) = -\frac{1}{2}\sum_{i,j} w_{ij} x_i x_j
</math>

where <math>x_i</math> is the spin (+1 or -1) of the <math>i</math>th electron and <math>W</math> is an <math>N \times N</math> matrix representing the weights between each pair of electrons.
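For concreteness, here is a minimal Python sketch of this energy function (an illustration, not code from the paper), assuming <math>x</math> is a vector of +1/-1 spins and <math>W</math> a symmetric NumPy array with zero diagonal:

<pre>
import numpy as np

def energy(x, W):
    """Spin-model energy E(x, W) = -1/2 * sum_ij W_ij * x_i * x_j."""
    return -0.5 * x @ W @ x
</pre>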

The probability of an electron configuration is given by

<math>
P(x|W) = \frac{1}{Z(W)} \exp(-\beta E(x, W))
</math>

where <math>Z(W)</math> is the normalization factor and <math>\beta</math> is a hyper-parameter called the <i>inverse temperature</i>.

Unfortunately, evaluating <math>Z(W)</math> is intractable because there are <math>2^N</math> possible configurations of electrons. Takamura et al. therefore use a clever approximation: they seek a function <math>Q(\theta, W)</math> that is as similar to <math>P(x|W)</math> as possible. As a distance measure between the two functions they use the <i>variational free energy</i> <math>F(\theta)</math>, defined as the difference between the mean energy with respect to <math>Q</math> and the entropy of <math>Q</math>.
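To make the blow-up explicit, the following sketch (again illustrative, reusing the energy function above) computes <math>P(x|W)</math> exactly for a toy system; the sum defining <math>Z(W)</math> ranges over all <math>2^N</math> configurations:

<pre>
from itertools import product
import numpy as np

def config_probability(x, W, beta=1.0):
    """Exact P(x|W) for small N, computing Z(W) by brute-force enumeration."""
    N = len(x)
    Z = sum(np.exp(-beta * energy(np.array(c), W))
            for c in product([-1, 1], repeat=N))
    return np.exp(-beta * energy(np.array(x), W)) / Z
</pre>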

This function's derivative can be found analytically; hence, given a starting value of <math>x</math>, a closed-form update rule can be derived, which is shown in the paper.
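The exact update rule appears in the paper; as an illustration, a standard mean-field fixed point for such a spin system has the form <math>\bar{x_i} = \tanh(\beta \sum_j w_{ij} \bar{x_j})</math> (that form is our assumption, not a quotation of the paper's equation). Continuing the sketches above, and omitting the paper's treatment of seed words:

<pre>
def mean_field(W, beta, x0, n_iters=100):
    """Iterate the assumed mean-field update m_i = tanh(beta * sum_j W_ij * m_j)."""
    m = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        m = np.tanh(beta * (W @ m))
    return m  # m_i in (-1, 1): estimated average spin (sentiment) of word i
</pre>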

They then require a way to compute the weight matrix <math>W</math>. They do this using their glossary of similar terms, defining <math>W_{ij} = \frac{1}{\sqrt{d(i) d(j)}}</math> for linked words <math>i</math> and <math>j</math> (and 0 otherwise), where <math>d(i)</math> represents the degree of word <math>i</math>.
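A sketch of this construction, assuming the lexical network is represented as a dict mapping each word to the set of words it links to (our representation, not the paper's):

<pre>
def build_weights(adjacency):
    """W_ij = 1/sqrt(d(i) * d(j)) for linked words i, j; 0 otherwise."""
    words = sorted(adjacency)
    index = {w: k for k, w in enumerate(words)}
    W = np.zeros((len(words), len(words)))
    for w in words:
        for v in adjacency[w]:
            W[index[w], index[v]] = 1.0 / np.sqrt(len(adjacency[w]) * len(adjacency[v]))
    return words, W
</pre>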

Finally, they discuss two methodologies for determining the hyper-parameter <math>\beta</math>. The first is a simple leave-one-out error-rate minimization, as is standard in many machine learning problems. The second is physics-inspired and is called the <i>magnetization</i> of the system, defined by

<math>
m = \frac{1}{N}\sum_i \bar{x_i}
</math>

They seek a value of <math>\beta</math> that makes <math>m</math> positive but as close as possible to zero. To accomplish this, they simply calculate <math>m</math> with several different values of <math>\beta</math> and select the best one they find.
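A sketch of this selection procedure, reusing the mean-field iteration above (the candidate grid of <math>\beta</math> values is an illustrative assumption):

<pre>
def choose_beta(W, x0, candidates=(0.1, 0.25, 0.5, 1.0, 2.0)):
    """Pick the beta whose magnetization m is positive but closest to zero."""
    best_beta, best_m = None, float("inf")
    for beta in candidates:
        m = mean_field(W, beta, x0).mean()  # m = (1/N) * sum_i x_bar_i
        if 0.0 < m < best_m:
            best_beta, best_m = beta, m
    return best_beta  # None if no candidate yields positive magnetization
</pre>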

Experimental Result

The approach described achieved accuracies ranging from 75.2% using two seed words to 91.5% using leave-one-out cross-validation. The authors compare their results to two previous methods for the same task on a separate lexical graph constructed using only synonym connections. The first is the graph-based shortest-distance algorithm of Hu and Liu, which achieved 70.8% accuracy, versus 73.4% for Takamura et al.'s approach. The second is Riloff et al.'s bootstrapping method, which achieved 72.8%, compared to Takamura et al.'s 83.6% on that data set.

Related papers