Popescu and Etzioni, EMNLP 2005


Revision as of 00:51, 2 October 2012

This is a summary of a research paper, written as part of Social Media Analysis 10-802, Fall 2012.

Citation

Ana-Maria Popescu, Oren Etzioni, Extracting product features and opinions from reviews, Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, p. 339-346, October 06-08, 2005, Vancouver, British Columbia, Canada.

Online Version

Direct PDF link

Abstract from the paper

Consumers are often forced to wade through many on-line reviews in order to make an informed product choice. This paper introduces OPINE, an unsupervised information-extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products.
Compared to previous work, OPINE achieves 22% higher precision (with only 3% lower recall) on the feature extraction task. OPINE’s novel use of relaxation labeling for finding the semantic orientation of words in context leads to strong performance on the tasks of finding opinion phrases and their polarity.

Summary

Overview

This paper proposes methods for mining opinions from product reviews and classifying them as positive or negative with respect to specific product features. It breaks the problem into four main sub-tasks:

  1. Identifying product features/attributes
  2. Mining opinions about product features
  3. Determining opinion polarity
  4. Ranking opinions based on their strength

To solve these sub-tasks, the paper introduces OPINE, an unsupervised review mining system built on top of the KnowItAll web information extraction (IE) system. The authors mainly discuss the first three sub-tasks.

OPINE System

OPINE extracts product features, and the opinion phrases that describe them, for different product classes. It is built on top of the KnowItAll IE system, which computes point-wise mutual information (PMI) between candidate facts for a given relation and automatically generated discriminator phrases. These PMI scores are fed to a Naive Bayes classifier, which outputs the probability associated with each fact.
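The PMI-plus-Naive-Bayes step can be illustrated with a minimal sketch. It is not the authors' implementation: the hit-count form of PMI, the thresholded boolean features, and every name and number below (`pmi_score`, `naive_bayes_probability`, the priors and likelihoods) are assumptions made for illustration.

```python
from math import prod


def pmi_score(hits_joint, hits_fact, hits_disc):
    """Hit-count PMI estimate between a candidate fact and a discriminator.

    Assumed form: joint hits divided by the product of the individual hits.
    Returns 0.0 when either individual count is zero.
    """
    if hits_fact == 0 or hits_disc == 0:
        return 0.0
    return hits_joint / (hits_fact * hits_disc)


def naive_bayes_probability(features, prior, p_true, p_false):
    """Probability that a fact is correct given boolean PMI features.

    features[i] is True when the PMI with discriminator i exceeds a threshold.
    p_true[i]  = P(feature i is True | fact is correct)    (hypothetical values)
    p_false[i] = P(feature i is True | fact is incorrect)  (hypothetical values)
    All probabilities are assumed to lie strictly between 0 and 1.
    """
    like_true = prior * prod(
        p if f else 1 - p for f, p in zip(features, p_true)
    )
    like_false = (1 - prior) * prod(
        p if f else 1 - p for f, p in zip(features, p_false)
    )
    return like_true / (like_true + like_false)


# Toy usage with made-up hit counts and likelihoods.
score = pmi_score(hits_joint=50, hits_fact=100, hits_disc=1000)
prob = naive_bayes_probability(
    features=[True, True], prior=0.5, p_true=[0.9, 0.8], p_false=[0.2, 0.1]
)
```

Thresholding each PMI score into a boolean feature keeps the classifier simple; the thresholds and conditional probabilities would have to be estimated from seed facts, which this sketch does not attempt.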

Finding Explicit Features

OPINE recursively extracts, for each product class, its parts and properties, and then the parts and properties of those, until no more candidates are found. It uses PMI scores from the KnowItAll system to assess candidate noun phrases as parts, and uses WordNet relations and morphological analysis to identify additional parts and properties.
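The recursive extraction loop can be sketched as a simple worklist, assuming a hypothetical `find_candidates` function that returns the accepted parts and properties of a concept. In OPINE that role is played by PMI assessment, WordNet, and morphological analysis; here it is a toy lookup table, and the traversal is written iteratively for clarity.

```python
def extract_features(product_class, find_candidates):
    """Collect (parent, feature) pairs until no new candidates appear.

    find_candidates(concept) is a hypothetical callback returning the
    accepted parts and properties of `concept`.
    """
    seen = {product_class}
    frontier = [product_class]
    features = []
    while frontier:
        concept = frontier.pop(0)
        for candidate in find_candidates(concept):
            if candidate in seen:
                continue  # skip features already extracted
            seen.add(candidate)
            features.append((concept, candidate))
            frontier.append(candidate)  # look for its own parts next
    return features


# Toy data standing in for the real candidate assessor.
toy_parts = {
    "scanner": ["cover", "size"],
    "cover": ["hinge"],
}
result = extract_features("scanner", lambda c: toy_parts.get(c, []))
```

The `seen` set guarantees termination even if the candidate source contains cycles, such as a part that lists its parent as one of its own parts.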

Sentiment Classification

Evaluation

Finding Explicit Features

The authors use seven product classes and compare OPINE with the opinion mining system of Hu and Liu (2004). OPINE improves precision by 22% over Hu and Liu's system, with a 3% drop in recall. The authors' analysis attributes around 6% of the precision gain to PMI assessment over the reviews and another 14% to Web PMI statistics from KnowItAll.
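For reference, precision and recall on a feature extraction task can be computed as in the sketch below. The feature sets are made up for illustration and are not data from the paper; `precision_recall` is a hypothetical helper name.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for a set of extracted features.

    precision = correct extractions / all extractions
    recall    = correct extractions / all gold-standard features
    """
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical system output and gold standard for one product class.
extracted = {"size", "cover", "price", "box"}
gold = {"size", "cover", "price", "resolution", "speed"}
precision, recall = precision_recall(extracted, gold)
```

A system that filters candidates more aggressively tends to trade recall for precision, which matches the direction of the comparison reported above.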

Discussion

Related Papers

Study Plan

Resources useful for understanding this paper

  • Article: Opinion Mining
  • KnowItAll [O. Etzioni, M. Cafarella, D. Downey, S. Kok, A. Popescu, T. Shaked, S. Soderland, D. Weld, and A. Yates. 2005. Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence, 165(1):91–134.]