Paper:Choi Y. and C. Cardie., ACL 2010
Citation
Choi, Y., and C. Cardie. Hierarchical Sequential Learning for Extracting Opinions and their Attributes. ACL-2010
Online Version
Here is the online version of the paper.
Summary
In this paper the authors apply a hierarchical parameter sharing technique using Conditional Random Fields (CRFs) to fine-grained opinion analysis. With this approach they jointly detect the boundaries of opinion expressions and their two key attributes: polarity and intensity. Previous approaches treated the identification of opinion expressions, polarity, and intensity as separate problems, so errors from individual components propagated through systems with cascaded architectures and degraded performance.
The authors pose the joint extraction of opinion expressions and their attributes as a sequence tagging task: given a sequence of tokens, they predict a sequence of labels. The labels are defined as conjunctive values of polarity and intensity labels, which are derived from a hierarchical structure of classes (see figure below).
The benefit of the hierarchical construction is that similar labels can share the same subcomponents of the feature and weight vectors, so the number of parameter sets is significantly reduced.
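To make the parameter-sharing argument concrete, the sketch below enumerates conjunctive labels and counts label-indexed transition parameter sets with and without the hierarchy. The label inventory (three polarities, three intensities, plus one "no opinion" label) and all names are assumptions made for illustration; they are not taken verbatim from the paper.

```python
from itertools import product

# Assumed label inventory (illustrative only).
POLARITIES = ["positive", "neutral", "negative"]
INTENSITIES = ["high", "medium", "low"]
NONE_LABEL = ("no-polarity", "no-intensity")

# Flat conjunctive labels: one per (polarity, intensity) pair, plus "no opinion".
FLAT_LABELS = [NONE_LABEL] + list(product(POLARITIES, INTENSITIES))


def decompose(label):
    """Map a conjunctive label to its (opinion, polarity, intensity) components."""
    polarity, intensity = label
    opinion = "no-opinion" if label == NONE_LABEL else "opinion"
    return opinion, polarity, intensity


# Without the hierarchy: one transition parameter set per ordered label pair.
flat_transitions = len(FLAT_LABELS) ** 2

# With the hierarchy: transitions are scored per component, so the parameter
# sets are indexed by ordered pairs of component values instead.
n_opinion = 2                        # opinion / no-opinion
n_polarity = len(POLARITIES) + 1     # plus "no-polarity"
n_intensity = len(INTENSITIES) + 1   # plus "no-intensity"
shared_transitions = n_opinion**2 + n_polarity**2 + n_intensity**2

print(len(FLAT_LABELS), flat_transitions, shared_transitions)
```

Under these assumptions, labels such as (positive, high) and (positive, low) reuse the same opinion and polarity components, which is where the sharing comes from.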
Brief description of the method
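Given a sequence of tokens, $x = x_1 \ldots x_n$, we predict a sequence of labels, $y = y_1 \ldots y_n$. The conditional probability $p(y|x)$ for CRFs is given as
$$p(y|x) = \frac{1}{Z_x} \exp \sum_i \left( \lambda \cdot f(y_i, x, i) + \lambda' \cdot f'(y_{i-1}, y_i, x, i) \right)$$
where $Z_x$ is the normalization factor. To apply a hierarchical sharing technique, the parameters are expanded as follows
$$\lambda \cdot f(y_i, x, i) = \lambda_{\alpha} \cdot g_O(\alpha, x, i) + \lambda_{\beta} \cdot g_P(\beta, x, i) + \lambda_{\gamma} \cdot g_S(\gamma, x, i)$$
$$\lambda' \cdot f'(y_{i-1}, y_i, x, i) = \lambda'_{\hat{\alpha},\alpha} \cdot g'_O(\hat{\alpha}, \alpha, x, i) + \lambda'_{\hat{\beta},\beta} \cdot g'_P(\hat{\beta}, \beta, x, i) + \lambda'_{\hat{\gamma},\gamma} \cdot g'_S(\hat{\gamma}, \gamma, x, i)$$
where $g_O$ and $g'_O$ are feature vectors for Opinion extraction, $g_P$ and $g'_P$ are feature vectors for Polarity extraction, $g_S$ and $g'_S$ are feature vectors for Strength (or Intensity) extraction, and $\alpha$, $\beta$, $\gamma$ are the opinion, polarity, and intensity components of the label $y_i$, with $\hat{\alpha}$, $\hat{\beta}$, $\hat{\gamma}$ the corresponding components of the previous label $y_{i-1}$.
The features of the form $g_O(\alpha, x, i)$, $g_P(\beta, x, i)$, and $g_S(\gamma, x, i)$ are the per-token features, which can be common to all attributes or attribute-specific. The features of the form $g'_O(\hat{\alpha}, \alpha, x, i)$, $g'_P(\hat{\beta}, \beta, x, i)$, and $g'_S(\hat{\gamma}, \gamma, x, i)$ are the transition features, employed to help with boundary extraction.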
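As a rough illustration of how a per-token score can be assembled from shared opinion, polarity, and intensity components, consider the minimal sketch below. The feature names, weight values, and the `token_score` helper are hypothetical and chosen only to show the decomposition; they do not reproduce the authors' feature set or implementation.

```python
def dot(weights, features):
    """Sparse dot product over feature-name -> value dictionaries."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def token_score(label, feats, w_opinion, w_polarity, w_intensity):
    """Per-token score as the sum of three component scores.

    label: (opinion, polarity, intensity) tuple for the current token
    feats: dict with keys "opinion", "polarity", "intensity", each a sparse
           feature dict for the current token
    w_*:   dict mapping a component value to its weight dict
    """
    opinion, polarity, intensity = label
    return (
        dot(w_opinion[opinion], feats["opinion"])
        + dot(w_polarity[polarity], feats["polarity"])
        + dot(w_intensity[intensity], feats["intensity"])
    )


# Toy weights (hypothetical values).
w_opinion = {"opinion": {"in_subj_lexicon": 1.5}}
w_polarity = {"positive": {"word=great": 2.0}}
w_intensity = {"high": {"has_intensifier": 1.0}, "low": {"has_intensifier": -0.5}}

# Toy features for one token (hypothetical names).
feats = {
    "opinion": {"in_subj_lexicon": 1.0},
    "polarity": {"word=great": 1.0},
    "intensity": {"has_intensifier": 1.0},
}

score_pos_high = token_score(("opinion", "positive", "high"), feats,
                             w_opinion, w_polarity, w_intensity)
score_pos_low = token_score(("opinion", "positive", "low"), feats,
                            w_opinion, w_polarity, w_intensity)
print(score_pos_high, score_pos_low)
```

The two labels differ only in intensity, so they share the opinion and polarity contributions and differ only in the intensity term. Transition scores could be decomposed the same way, with weights indexed by pairs of component values.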
Experimental Result
The system was evaluated on the Multi-Perspective Question Answering (MPQA) corpus at the opinion entity level rather than the token level. The authors investigated three options:
- Polarity-Only Intensity-Only: Combining the results from two separate tagging CRFs - 1) Opinion expression extraction for polarity attribute, and 2) Opinion expression extraction for intensity attribute.
- Joint without Hierarchy: Linear-chain CRFs without exploiting class hierarchy.
- Joint with Hierarchy: With the hierarchical sequential learning.
Precision, recall, and F-measure were used as the evaluation metrics. The following table summarizes the performance:
The authors showed that the simple joint sequential tagging even without exploiting the hierarchy brings better performance than combining two separate models. In addition, the hierarchical joint sequential learning brings a further performance gain.
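Entity-level precision, recall, and F-measure of the kind described above can be sketched as follows. This version assumes exact span-and-label matching and a simple per-token labeling in which consecutive identical labels form one entity; the paper's actual matching criterion may differ, and the label strings are hypothetical.

```python
def extract_spans(labels, outside="O"):
    """Return the set of (start, end, label) spans in a per-token label sequence.

    Consecutive tokens with the same non-outside label form one span;
    `end` is exclusive.
    """
    labels = list(labels)
    spans = set()
    start = None
    # A trailing sentinel closes any span that runs to the end of the sequence.
    for i, lab in enumerate(labels + [outside]):
        if start is not None and lab != labels[start]:
            spans.add((start, i, labels[start]))
            start = None
        if start is None and lab != outside:
            start = i
    return spans


def precision_recall_f1(gold_spans, pred_spans):
    """Exact-match entity-level precision, recall, and F1."""
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gold = ["O", "pos-high", "pos-high", "O", "neg-low"]
pred = ["O", "pos-high", "pos-high", "O", "O"]
p, r, f = precision_recall_f1(extract_spans(gold), extract_spans(pred))
print(p, r, f)
```

Here the prediction recovers one of the two gold entities exactly and proposes nothing else, so precision is 1.0 and recall is 0.5.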
Related papers
The hierarchical parameter sharing technique has been previously used by Zhao et al. (2008) for opinion analysis, but only to classify sentence-level attributes.