Paper:Choi Y. and C. Cardie., ACL 2010

From Cohen Courses
Revision as of 18:20, 29 October 2011 by Taruns (talk | contribs)

Citation

Choi, Y., and C. Cardie. Hierarchical Sequential Learning for Extracting Opinions and their Attributes. ACL-2010

Online Version

Here is the online version of the paper.

Summary

In this paper the authors apply a hierarchical parameter-sharing technique using Conditional Random Fields (CRFs) to fine-grained opinion analysis. With this approach they jointly detect the boundaries of opinion expressions and their two key attributes, polarity and intensity. Previous approaches treated the identification of opinion expressions, polarity, and intensity as separate problems, so in systems with cascaded component architectures the errors of individual components propagated and degraded performance.

The authors pose the joint extraction of opinion expressions and their attributes as a sequence tagging task: given a sequence of tokens, they predict a sequence of labels. The labels are defined as conjunctive values of polarity and intensity labels, which are derived from a hierarchical structure of classes (see figure below).

Hierarchicy.png

The benefit of the hierarchical construction is that similar labels can share the same subcomponents of the feature and weight vectors, so the number of parameter sets is significantly reduced.
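The reduction can be illustrated with a small count. This is a minimal sketch, not the authors' code: the component label names and the single non-opinion label are assumptions made for illustration, since the hierarchy figure is not reproduced in the text. It compares how many transition weight sets a flat joint labelling needs against a labelling that shares parameters per level of the hierarchy.

```python
from itertools import product

# Assumed component label sets (hypothetical names, chosen for illustration).
OPINION = ["OPINION", "NO-OPINION"]
POLARITY = ["POSITIVE", "NEUTRAL", "NEGATIVE", "NO-POLARITY"]
INTENSITY = ["HIGH", "MEDIUM", "LOW", "NO-INTENSITY"]


def joint_labels():
    """Conjunctive polarity x intensity labels plus one non-opinion label."""
    labels = [("OPINION", p, s)
              for p, s in product(POLARITY[:-1], INTENSITY[:-1])]
    labels.append(("NO-OPINION", "NO-POLARITY", "NO-INTENSITY"))
    return labels


def flat_transition_sets(labels):
    """One weight set per ordered pair of joint labels."""
    return len(labels) ** 2


def hierarchical_transition_sets():
    """One weight set per ordered pair within each level of the hierarchy."""
    return sum(len(level) ** 2 for level in (OPINION, POLARITY, INTENSITY))


labels = joint_labels()
print(len(labels))                     # 10
print(flat_transition_sets(labels))    # 100
print(hierarchical_transition_sets())  # 36
```

Under these assumed label sets, sharing per level cuts the transition weight sets from 100 to 36, and every joint label that contains, say, NEGATIVE reuses the same NEGATIVE parameters.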

Brief description of the method

Given a sequence of tokens x, we predict a sequence of labels y. The conditional probability p(y|x) for CRFs is given as

Crfeq.png
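For reference, a linear-chain CRF is commonly written in the following form. The symbols are assumptions chosen for this sketch (weight vectors \lambda and \lambda', per-token feature vector f, transition feature vector f', and normalizer Z_x) and may not match the notation of the original equation image.

```latex
p(y \mid x) = \frac{1}{Z_x} \exp \sum_{i}
  \Big( \lambda \cdot f(y_i, x, i) + \lambda' \cdot f'(y_{i-1}, y_i, x, i) \Big),
\qquad
Z_x = \sum_{y'} \exp \sum_{i}
  \Big( \lambda \cdot f(y'_i, x, i) + \lambda' \cdot f'(y'_{i-1}, y'_i, x, i) \Big)
```

The first term scores each label against the observed tokens, and the second scores each pair of adjacent labels.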

To apply a hierarchical sharing technique, the parameters are expanded as follows

Eq1.png

where one pair of feature vectors is defined for Opinion extraction, one pair for Polarity extraction, and one pair for Strength (or Intensity) extraction, and

Eq2.png

The per-token features can be common to all attributes or attribute-specific. The transition features are employed to help with boundary extraction.
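The per-token part of the hierarchical expansion can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the label encoding ("POLARITY/INTENSITY", with "O" for non-opinion tokens), the feature names, and the weight values are all hypothetical. The score of a joint label is the sum of the scores of its opinion, polarity, and intensity components, so labels that share a component share its weights.

```python
from collections import defaultdict


def decompose(label):
    """Split a joint label into (opinion, polarity, intensity) components."""
    if label == "O":
        return ("NO-OPINION", "NO-POLARITY", "NO-INTENSITY")
    polarity, intensity = label.split("/")
    return ("OPINION", polarity, intensity)


def per_token_score(weights, label, features):
    """Sum component weights over the active features of one token."""
    return sum(weights[(component, feat)] * value
               for component in decompose(label)
               for feat, value in features.items())


# Hypothetical weights, keyed by (component label, feature name).
weights = defaultdict(float)
weights[("OPINION", "in_subj_lexicon")] = 1.0
weights[("NEGATIVE", "word=terrible")] = 2.0
weights[("HIGH", "word=terrible")] = 0.5

# Hypothetical features active on a single token.
features = {"in_subj_lexicon": 1.0, "word=terrible": 1.0}

for label in ("NEGATIVE/HIGH", "NEGATIVE/LOW", "POSITIVE/HIGH", "O"):
    print(label, per_token_score(weights, label, features))
```

In this toy setting, NEGATIVE/HIGH and NEGATIVE/LOW both benefit from the weight learned for NEGATIVE, which is the sharing effect the hierarchy is meant to provide. Transition scores would decompose in the same way over pairs of adjacent component labels.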

Experimental Result

The system was evaluated on the Multi-Perspective Question Answering (MPQA) corpus at the opinion-entity level rather than the token level. The authors investigated three options:

  • Polarity-Only Intensity-Only: Combining the results from two separate tagging CRFs - 1) Opinion expression extraction for polarity attribute, and 2) Opinion expression extraction for intensity attribute.
  • Joint without Hierarchy: Linear-chain CRFs without exploiting class hierarchy.
  • Joint with Hierarchy: With the hierarchical sequential learning.

Precision, recall, and F-measure were used as the evaluation metrics. The following table summarizes the performance:

Table1.png

The authors showed that simple joint sequential tagging, even without exploiting the hierarchy, performs better than combining two separate models. In addition, hierarchical joint sequential learning brings a further performance gain.
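Entity-level precision, recall, and F-measure can be computed along the following lines. This is a minimal sketch that assumes an exact-match criterion on (start, end, polarity, intensity) tuples; the matching criterion actually used in the paper may differ, and the example spans are invented for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores, counting a prediction as correct only on exact match."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical opinion entities: (start token, end token, polarity, intensity).
gold = {(0, 2, "NEGATIVE", "HIGH"), (5, 6, "POSITIVE", "LOW")}
pred = {(0, 2, "NEGATIVE", "HIGH"),
        (5, 6, "POSITIVE", "MEDIUM"),
        (8, 9, "NEUTRAL", "LOW")}

print(precision_recall_f1(gold, pred))
```

Here the second predicted entity has the right boundaries but the wrong intensity, so it counts as an error. Scoring whole entities in this way penalizes attribute mistakes that a token-level metric could partly hide.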

Related papers

The hierarchical parameter sharing technique has been previously used by Zhao et al. (2008) for opinion analysis, but only to classify sentence-level attributes.