Popescu and Etzioni, EMNLP 2005

From Cohen Courses
Revision as of 23:51, 1 October 2012 by Sushantk (talk | contribs)

This is a summary of a research paper, written as part of Social Media Analysis 10-802, Fall 2012.

Citation

Ana-Maria Popescu, Oren Etzioni, Extracting product features and opinions from reviews, Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, p.339-346, October 06-08, 2005, Vancouver, British Columbia, Canada.

Online Version

Direct PDF link

Abstract from the paper

Consumers are often forced to wade through many on-line reviews in order to make an informed product choice. This paper introduces OPINE, an unsupervised information-extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products.
Compared to previous work, OPINE achieves 22% higher precision (with only 3% lower recall) on the feature extraction task. OPINE’s novel use of relaxation labeling for finding the semantic orientation of words in context leads to strong performance on the tasks of finding opinion phrases and their polarity.

Summary

Overview

This paper proposes methods for mining opinions from product reviews and classifying them as positive or negative with respect to specific product features. It breaks the problem into four main subtasks:

  1. Identifying product features/attributes
  2. Mining opinions about product features
  3. Determining opinion polarity
  4. Ranking opinions based on their strength

To solve these subtasks, the paper introduces OPINE, an unsupervised review-mining system built on top of the KnowItAll web information extraction (IE) system. The authors mainly discuss the first three subtasks.

OPINE System

The OPINE system extracts product features, and the opinion phrases that describe them, for different product classes. It is built on top of the KnowItAll IE system, which computes pointwise mutual information (PMI) between the candidate facts for a given relation and automatically generated discriminator phrases. The PMI scores are then used with a Naive Bayes classifier to give the probability associated with each fact.
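The PMI assessment described above can be illustrated with a minimal sketch. It assumes one common hit-count form of PMI, namely the number of hits for the candidate and discriminator together divided by the product of their individual hit counts; the exact normalization used inside KnowItAll may differ. The function name `pmi`, the discriminator "of camera", and all counts are hypothetical and chosen only for illustration.

```python
def pmi(hits_joint: int, hits_fact: int, hits_discriminator: int) -> float:
    """PMI-style association score computed from raw hit counts.

    Assumed form: Hits(d + f) / (Hits(d) * Hits(f)).
    Returns 0.0 when either individual count is zero.
    """
    if hits_fact == 0 or hits_discriminator == 0:
        return 0.0
    return hits_joint / (hits_fact * hits_discriminator)


# Hypothetical hit counts for two candidate features of the class "camera".
HYPOTHETICAL_HITS = {
    "lens": {"joint": 900, "fact": 10_000},
    "weather": {"joint": 20, "fact": 50_000},
}
DISCRIMINATOR_HITS = 2_000  # hypothetical hits for the discriminator "of camera"

scores = {
    candidate: pmi(h["joint"], h["fact"], DISCRIMINATOR_HITS)
    for candidate, h in HYPOTHETICAL_HITS.items()
}

# Candidates that co-occur with the discriminator more often than their
# overall frequency would suggest rank higher.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In the system described above, scores of this kind are not used directly as a ranking; they are passed to a Naive Bayes classifier that outputs a probability for each candidate fact.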

Finding Explicit Features

For each product class, OPINE recursively extracts its parts and properties, then the parts and properties of those, and so on until no more candidates are available. It uses PMI scores from the KnowItAll system to assess candidate noun phrases as parts, and uses WordNet relations and morphological analysis to identify more parts and properties.
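The expansion loop described above can be sketched as follows. This is not OPINE's implementation: it replaces the recursion with an equivalent worklist (queue), and the two callbacks `find_candidates` and `is_feature` are hypothetical stand-ins for noun-phrase candidate extraction and PMI-based assessment. The toy data is invented for illustration.

```python
from collections import deque


def extract_features(product_class, find_candidates, is_feature):
    """Expand a product class into features, then features of those features.

    find_candidates(concept) -> iterable of candidate noun phrases (hypothetical).
    is_feature(candidate, concept) -> bool, e.g. a thresholded PMI test (hypothetical).
    Returns a list of (parent_concept, feature) pairs in discovery order.
    """
    accepted = []
    seen = {product_class}
    queue = deque([product_class])
    while queue:  # stops when no new candidates are available
        concept = queue.popleft()
        for candidate in find_candidates(concept):
            if candidate in seen:
                continue
            seen.add(candidate)
            if is_feature(candidate, concept):
                accepted.append((concept, candidate))
                queue.append(candidate)  # look for this feature's own parts/properties
    return accepted


# Toy stand-ins for the extraction and assessment steps.
TOY_CANDIDATES = {
    "scanner": ["cover", "size", "day"],
    "cover": ["hinge"],
}
TOY_ACCEPTED = {("scanner", "cover"), ("scanner", "size"), ("cover", "hinge")}

features = extract_features(
    "scanner",
    find_candidates=lambda concept: TOY_CANDIDATES.get(concept, []),
    is_feature=lambda candidate, parent: (parent, candidate) in TOY_ACCEPTED,
)
print(features)
```

The `seen` set keeps the loop from revisiting a noun phrase, which guarantees termination even if two concepts list each other as candidates.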

Sentiment Classification

Evaluation

Finding Explicit Features

The authors use 7 different product classes and compare OPINE with the opinion mining system of Hu and Liu (2004). OPINE improves precision by 22% over Hu and Liu's system, at the cost of a 3% drop in recall. Their analysis attributes a gain of around 6% in precision to PMI assessment over the reviews and another 14% to Web PMI statistics from KnowItAll.
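For readers less familiar with the metrics, the sketch below shows how precision and recall are typically computed for a feature extraction task, by comparing the extracted feature set against a hand-labeled gold set. The function name and both sets are hypothetical; they are not data from the paper.

```python
def precision_recall(extracted: set, gold: set) -> tuple:
    """Precision = correct / extracted, recall = correct / gold."""
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical extracted and gold-standard features for one product class.
extracted_features = {"lens", "battery life", "weather"}
gold_features = {"lens", "battery life", "zoom", "flash"}

precision, recall = precision_recall(extracted_features, gold_features)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A system that filters candidates more aggressively, as OPINE does with PMI assessment, tends to remove wrong extractions such as "weather" (raising precision) while occasionally discarding a true feature (lowering recall).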

Discussion

Related Papers

Study Plan

Resources useful for understanding this paper

  • Article: Opinion Mining
  • KnowItAll [O. Etzioni, M. Cafarella, D. Downey, S. Kok, A. Popescu, T. Shaked, S. Soderland, D. Weld, and A. Yates. 2005. Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence, 165(1):91–134.]