Product Feature Extraction and Sentiment Analysis in Product Reviews

Comments

  • I liked the idea behind this proposal of finding out important features for a product type and then collecting sentiment about each feature which contributes to overall sentiment about the product.
  • Natalie Glance has done some work on product reviews which might be interesting to you: [1]
    • Thanks, we read this paper. It solves a different problem, ranking the products by aggregating the reviews.
  • Evaluating the extracted features and the feature-specific sentiments is going to be challenging. Do you have the data from Hu and Liu (2004), or are you going to implement their techniques and then compare the two sets of results?

--Bbd 02:19, 11 October 2012 (UTC)

Team Members

Project Title

Product Feature Extraction and Sentiment Analysis in Product Reviews

Project Abstract

In this project, we plan to analyze product reviews from various product classes to find product features and customers' opinions about those features. Using this analysis, we aim to identify the good and bad aspects of a given product, feature by feature. This can help customers decide how well a product satisfies their needs when they care about only a few important features and not the rest.
We plan to use the product information available on www.amazon.com to learn common features for a given product type to help reduce redundancy. We also aim to analyze the performance of different feature extraction techniques on different product classes. Mining opinions about a specific product feature requires some level of semantic understanding to separate them from opinions about other features mentioned in the same review. In addition, the same word can express contrasting opinions depending on context, so assigning each sentiment word a single global polarity would lead to incorrect sentiment classification.
Given these challenges, we aim to build a robust solution for extracting features, and the opinions about them, from product reviews.

Task

Given reviews for a product, we identify its features/attributes, extract the opinions expressed about those features, and classify each opinion as positive or negative. We will analyze the opinions expressed for each product feature to determine which features are good or bad from a customer's perspective. We will also obtain a list of commonly used features for a given product type.

Data

Amazon product reviews data set -

  1. http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#datasets
  2. Amazon product review dataset for various classes
  3. Sentence Level Sentiment dataset for product reviews in various domains

In addition, we have already built our own web crawlers and extracted more product reviews from www.amazon.com to train our system.

Evaluation

  • Product feature extraction can be evaluated by comparison with another system, such as Hu and Liu's (2004) feature extraction system, on the same dataset. It can also be evaluated qualitatively by checking whether the most useful features of a product are obtained. For this, we can manually create a list of useful features for the target products.
  • The available datasets have an annotated test set for sentiment analysis, which can be used to evaluate the opinion mining task. We can also compare the performance with other systems (Related Papers) on the same dataset.
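
As a rough illustration of how extracted features could be scored against a manually created list, here is a minimal sketch. It assumes exact string matching after lowercasing, which is our simplification; the function name and the example feature lists are hypothetical and not taken from any dataset.

```python
def precision_recall_f1(extracted, gold):
    """Score extracted feature strings against a manually built gold list.

    Assumption: a feature counts as correct only on an exact,
    case-insensitive string match with a gold feature.
    """
    extracted_set = {f.strip().lower() for f in extracted}
    gold_set = {f.strip().lower() for f in gold}
    true_positives = len(extracted_set & gold_set)

    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example lists for one product.
gold_features = ["battery life", "screen", "camera", "price"]
system_features = ["screen", "camera", "price", "thing", "day"]

p, r, f = precision_recall_f1(system_features, gold_features)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact matching is strict: near-synonyms such as "display" and "screen" would count as errors, so a real evaluation might first normalize or group feature names.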

Baseline

We plan to use the following baselines -

  • Product Feature Extraction: We can use an n-gram model to extract noun phrases/words, which are usually candidate features for a product. We can also use WordNet synset data to expand the list of candidate features, and apply an appropriate frequency threshold to discard unimportant features.
  • Sentiment Analysis: We can use a list of sentiment words already marked as positive or negative, and then score each sentence as positive, negative, or neutral based on the presence of these words in that sentence.
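
The two baselines above could be sketched roughly as follows. This is only an illustration under our own assumptions: the stopword list and sentiment word lists are tiny hypothetical stand-ins for real resources, stopword filtering replaces proper noun-phrase detection (which would need a POS tagger), and WordNet expansion is left out.

```python
import re
from collections import Counter

# Tiny illustrative word lists (assumptions); a real run would load a full
# stopword list and a published sentiment lexicon instead.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "it", "this", "but", "and",
             "i", "of", "to", "in", "very", "has", "have", "my", "for", "with"}
POSITIVE = {"good", "great", "excellent", "amazing", "love", "sharp"}
NEGATIVE = {"bad", "poor", "terrible", "awful", "hate", "short"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def candidate_features(reviews, min_count=2, max_n=2):
    """Count unigrams and bigrams free of stopwords and sentiment words,
    then keep those that occur at least min_count times."""
    counts = Counter()
    for review in reviews:
        # Split into sentences so n-grams do not cross sentence boundaries.
        for sentence in re.split(r"[.!?]+", review):
            tokens = tokenize(sentence)
            for n in range(1, max_n + 1):
                for i in range(len(tokens) - n + 1):
                    gram = tokens[i:i + n]
                    if any(t in STOPWORDS or t in POSITIVE or t in NEGATIVE
                           for t in gram):
                        continue
                    counts[" ".join(gram)] += 1
    return [g for g, c in counts.most_common() if c >= min_count]


def sentence_polarity(sentence):
    """Label a sentence by counting positive and negative lexicon words."""
    tokens = tokenize(sentence)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical reviews, written for this example.
reviews = [
    "The battery life is great. The screen is sharp.",
    "Battery life was poor but the screen is amazing.",
    "I love the screen.",
]
features = candidate_features(reviews)
mixed = sentence_polarity("Battery life was poor but the screen is amazing.")
```

Note that the mixed sentence in the usage example contains one positive and one negative word, so this sentence-level baseline labels it neutral even though it expresses clear opinions about two different features.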

Challenges

  • Some product classes, such as movies and books, do not have well-defined features. For such classes, we need to identify implicit features based on what customers liked or disliked, such as the plot of a movie or a specific actor's performance.
  • Currently available datasets usually label an entire review as expressing a positive or negative opinion about a product. But even a review labeled positive can contain a negative opinion about one or more product features. This needs to be considered when mining opinions for individual features and when evaluating results.
  • It would be interesting to identify the phrases within a sentence that carry a positive or negative opinion, rather than tagging the whole sentence as positive or negative.
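
One simple way to move from sentence-level to phrase-level opinions is to split a sentence into clauses and score each clause separately. The sketch below is a hypothetical illustration, not a method we have committed to: the clause boundaries (punctuation and a few contrastive connectives), the small word lists, and the function name are all our assumptions.

```python
import re

# Small illustrative sentiment word lists (assumptions).
POSITIVE = {"great", "amazing", "good", "excellent"}
NEGATIVE = {"poor", "bad", "terrible", "weak"}

# Assumed clause boundaries: punctuation and a few contrastive connectives.
CLAUSE_SPLIT = re.compile(r"[,;.!?]|\b(?:but|however|although|though|while)\b")


def feature_opinions(sentence, features):
    """Assign a polarity to each known feature based on the clause it occurs in."""
    opinions = {}
    for clause in CLAUSE_SPLIT.split(sentence.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
        if score == 0:
            continue  # no opinion words, or they cancel out in this clause
        label = "positive" if score > 0 else "negative"
        # Pad with spaces so features match whole words only.
        clause_text = " " + " ".join(tokens) + " "
        for feature in features:
            if f" {feature} " in clause_text:
                opinions[feature] = label
    return opinions


result = feature_opinions(
    "Battery life was poor but the screen is amazing.",
    ["battery life", "screen"],
)
print(result)
```

Here the two clauses around "but" are scored independently, so each feature receives its own polarity. Clause splitting of this kind will miss opinions that span clauses and does not handle negation.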

Related Papers