Extracting Opinion Expressions with semi-Markov Conditional Random Fields

From Cohen Courses
Revision as of 11:36, 29 September 2012 by Ydalal (talk | contribs)

Citation

 author    = {Yang, Bishan  and  Cardie, Claire},
 title     = {Extracting Opinion Expressions with semi-Markov Conditional Random Fields},
 booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
 month     = {July},
 year      = {2012},
 address   = {Jeju Island, Korea},
 publisher = {Association for Computational Linguistics},
 pages     = {1335--1345},

Online version

ACLWEB 2012

Summary

This paper proposes a segment-level sequence labeling approach to extracting opinion expressions, using semi-Markov conditional random fields (semi-CRF). SentiWordNet is a lexical resource in which each WordNet synset s is associated with three numerical scores, Obj(s), Pos(s), and Neg(s), which describe how objective, positive, and negative the terms contained in the synset are.

The motivation behind SentiWordNet is to aid opinion mining by providing an off-the-shelf lexical resource that supplies fine-grained opinion tags for a large set of words.

The development method of SentiWordNet is an adaptation of the PN-polarity [Esuli and Sebastiani, 2005] and SO-polarity [Esuli and Sebastiani, 2006] identification methods. The proposed method uses a set of ternary classifiers, each capable of deciding whether a synset is Positive, Negative, or Objective. The classifiers differ from one another in two respects: the training data used and the learner. As a result, each ternary classifier may produce a different classification for the same synset. The final opinion scores are obtained by normalizing the scores from all the classifiers.
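The normalization step can be illustrated with a minimal sketch. It assumes that each classifier in the committee casts one hard vote per synset and that a label's score is the proportion of votes it received; the function name `committee_scores` and the example votes are hypothetical and are not taken from the SentiWordNet implementation.

```python
from collections import Counter

LABELS = ("Pos", "Neg", "Obj")


def committee_scores(votes):
    """Turn the labels predicted by a committee of ternary classifiers
    into three scores that sum to 1.0 (assumption: one hard vote each)."""
    if not votes:
        raise ValueError("need at least one classifier vote")
    counts = Counter(votes)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(votes)
    # Proportion of classifiers that chose each label.
    return {label: counts[label] / total for label in LABELS}


# Hypothetical votes from eight classifiers for a single synset.
votes = ["Pos", "Pos", "Obj", "Pos", "Neg", "Obj", "Pos", "Obj"]
scores = committee_scores(votes)
print(scores)
```

Under this assumption a synset on which all classifiers agree receives a score of 1.0 for that label, while disagreement among the classifiers yields graded scores.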

Study Plan

This paper uses semi-CRF for the labeling task, so the reader should first read about semi-CRF.
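As a starting point for that reading, the sketch below shows what segment-level labeling means in practice: a semi-Markov Viterbi search that assigns one label to each multi-token segment rather than to each token. It is a generic illustration of semi-CRF decoding, not the paper's model; the function `semi_markov_viterbi`, the toy scoring function `toy_score`, the label names, and the example sentence are all hypothetical, and a real semi-CRF would compute segment scores from learned feature weights.

```python
def semi_markov_viterbi(n, labels, max_len, segment_score):
    """Return the highest-scoring list of (start, end, label) segments
    covering positions 0..n-1, with each segment at most max_len long.

    segment_score(start, end, label, prev_label) scores the segment
    covering positions start..end-1; prev_label is None for the first
    segment.
    """
    neg_inf = float("-inf")
    # best[end][y]: best score over the first `end` positions when the
    # last segment carries label y; back[end][y] stores (start, prev_label).
    best = [{y: neg_inf for y in labels} for _ in range(n + 1)]
    back = [{y: None for y in labels} for _ in range(n + 1)]

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            if start == 0:
                candidates = [(None, 0.0)]
            else:
                candidates = [(p, best[start][p]) for p in labels]
            for y in labels:
                for prev, prev_score in candidates:
                    if prev_score == neg_inf:
                        continue
                    score = prev_score + segment_score(start, end, y, prev)
                    if score > best[end][y]:
                        best[end][y] = score
                        back[end][y] = (start, prev)

    # Follow the back-pointers from the best final label.
    segments = []
    end = n
    label = max(labels, key=lambda y: best[n][y])
    while end > 0:
        start, prev = back[end][label]
        segments.append((start, end, label))
        end, label = start, prev
    segments.reverse()
    return segments


tokens = ["I", "really", "like", "it"]


def toy_score(start, end, label, prev_label):
    """Hypothetical scores: reward the span 'really like' as an opinion."""
    if label == "OPINION":
        return 2.0 if tokens[start:end] == ["really", "like"] else -1.0
    # Outside segments are only allowed cheaply at length 1.
    return 0.0 if end - start == 1 else -1.0


segments = semi_markov_viterbi(len(tokens), ["O", "OPINION"], 3, toy_score)
print(segments)
```

With these toy scores the search labels "really like" as a single two-token OPINION segment. A token-level tagger would instead have to label "really" and "like" separately and could not score the phrase as a whole.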