Product Feature Extraction and Sentiment Analysis in Product Reviews
Comments
- I liked the idea behind this proposal of finding out important features for a product type and then collecting sentiment about each feature which contributes to overall sentiment about the product.
- Natalie Glance has done some work on product reviews that might be interesting to you: [1]
- Thanks, we read this paper. It solves a different problem, ranking the products by aggregating the reviews.
- Evaluation of extracted features and feature-specific sentiments is going to be challenging. Do you have the data from Hu and Liu (2004), or are you going to implement their techniques and then compare the two sets of results?
- Yes, we have the dataset from Hu and Liu 2004, Amazon product review dataset for various classes, which we can use to evaluate our feature extraction process. --ydalal 19:19, 15 October, 2012 (UTC).
--Bbd 02:19, 11 October 2012 (UTC)
Team Members
Project Title
Product Feature Extraction and Sentiment Analysis in Product Reviews
Project Abstract
In this project, we plan to analyze reviews from various product classes to find the product features and the opinions customers express about those features. Using this analysis, we aim to identify the good and bad aspects of a given product feature by feature. This can be a practical way to help customers decide how well a product satisfies their needs when they care about only a few important features and not the rest.
We plan to use the product information available on www.amazon.com to learn common features for a given product type, which should help reduce redundancy. We also aim to analyze the performance of different feature extraction techniques on different product classes. Opinion mining for a specific product feature requires some level of semantic understanding to separate the opinion about that feature from opinions about other features mentioned in the same review. In addition, the same word can express contrasting opinions depending on context, so assigning each sentiment word a single global polarity would lead to incorrect sentiment classification.
Given these challenges, we aim to build a robust solution for extracting features, and the opinions about them, from product reviews.
Task
In this project, given the reviews for a product, we identify its features/attributes, extract the opinions expressed about those features, and classify each opinion as positive or negative. We will analyze the opinions expressed for each product feature and determine which features are good and bad from a customer's perspective. We will also obtain a list of commonly used features for a given product type.
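The last step of this task, turning classified opinions into a per-feature view of the product, can be sketched as follows. This is only a minimal illustration, assuming the earlier stages already produce (feature, label) pairs; the function names, the `positive_share` field, and the 0.5 threshold are hypothetical choices and not part of the proposal.

```python
from collections import defaultdict


def summarize_features(opinions):
    """Count positive and negative opinions per feature.

    `opinions` is an iterable of (feature, label) pairs, where label is
    "positive" or "negative" (an assumed output format of the classifier).
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for feature, label in opinions:
        counts[feature][label] += 1

    summary = {}
    for feature, c in counts.items():
        total = c["positive"] + c["negative"]
        summary[feature] = {
            "positive": c["positive"],
            "negative": c["negative"],
            # Fraction of opinions about this feature that are positive.
            "positive_share": c["positive"] / total,
        }
    return summary


def good_and_bad(summary, threshold=0.5):
    """Split features into good and bad lists using an assumed threshold."""
    good = sorted(f for f, s in summary.items() if s["positive_share"] > threshold)
    bad = sorted(f for f, s in summary.items() if s["positive_share"] < threshold)
    return good, bad


if __name__ == "__main__":
    opinions = [
        ("battery", "positive"),
        ("battery", "positive"),
        ("battery", "negative"),
        ("screen", "negative"),
    ]
    print(good_and_bad(summarize_features(opinions)))
```

Features whose share equals the threshold exactly fall into neither list, which keeps evenly split features out of both the good and the bad summary.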
Data
Amazon product reviews data set -
- http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#datasets
- Amazon product review dataset for various classes
- Sentence Level Sentiment dataset for product reviews in various domains
Apart from that, we have already built our own web crawlers and extracted some more product reviews from www.amazon.com to train our system.
Evaluation
- Product feature extraction can be evaluated by comparison with another system, such as Hu and Liu's (2004) feature extraction system, on the same dataset. Feature extraction can also be evaluated qualitatively, by checking whether the most useful features of a product are obtained. For this, we can manually create a list of useful features for the target products.
- The available datasets have an annotated test set for sentiment analysis, which can be used to evaluate the opinion mining task. We can also compare performance with other systems (see Related Papers) on the same dataset.
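One way to score extracted features against a manually created list is precision, recall, and F1 over the two sets. The sketch below assumes exact matching after lowercasing; the proposal does not fix a metric or a matching rule, so both are assumptions, and the function name is hypothetical.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted features with a gold list using exact set overlap.

    Matching is case-insensitive and ignores surrounding whitespace
    (an assumed normalization; real systems may need fuzzier matching).
    """
    extracted_set = {f.strip().lower() for f in extracted}
    gold_set = {f.strip().lower() for f in gold}

    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    extracted = ["battery", "screen", "price", "thing"]
    gold = ["Battery", "Screen", "Zoom"]
    print(precision_recall_f1(extracted, gold))
```

The same function can be run on the output of two systems over one dataset to compare them on equal terms.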
Baseline
We plan to use the following baselines -
- Product Feature Extraction: We can use an n-gram model to extract noun phrases/words, which are usually the candidate features for a product. We can also use WordNet synset data to expand the list of candidate features, and apply an appropriate frequency threshold to discard unimportant features.
- Sentiment Analysis: We can use a list of sentiment words already marked as positive or negative, and then score each sentence as positive, negative, or neutral based on the presence of these words in that sentence.
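Both baselines can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions: the tiny `STOPWORDS`, `POSITIVE`, and `NEGATIVE` sets are placeholders for real word lists, noun-phrase detection is approximated by filtering stopwords and sentiment words out of unigrams and bigrams (a real system would likely use a part-of-speech tagger), and the WordNet expansion step is omitted.

```python
import re
from collections import Counter

# Placeholder word lists (assumptions); a real baseline would load full lexicons.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "but", "and", "it", "this", "i"}
POSITIVE = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def candidate_features(reviews, min_count=2, max_n=2):
    """Return n-grams (up to max_n words) seen at least min_count times.

    N-grams containing a stopword or a sentiment word are skipped, as a
    rough stand-in for keeping only noun phrases.
    """
    excluded = STOPWORDS | POSITIVE | NEGATIVE
    counts = Counter()
    for review in reviews:
        tokens = tokenize(review)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if any(tok in excluded for tok in gram):
                    continue
                counts[" ".join(gram)] += 1
    # Frequency threshold discards rarely mentioned candidates.
    return [gram for gram, count in counts.most_common() if count >= min_count]


def sentence_polarity(sentence):
    """Label a sentence by counting lexicon hits: positive minus negative."""
    tokens = tokenize(sentence)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    reviews = [
        "The battery life is great.",
        "Battery life is poor but the screen is good.",
        "The screen is terrible.",
    ]
    print(candidate_features(reviews))
    print([sentence_polarity(r) for r in reviews])
```

Because the second review contains one positive and one negative word, the sentence-level score cancels out to neutral, which illustrates why a single label per sentence loses feature-specific opinions.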
Challenges
- Some product classes, such as movies and books, do not have well-defined features. For such classes, we need to identify implicit features based on what customers liked or disliked, such as a movie's plot or a specific actor's performance.
- Currently available datasets usually label an entire review as positive or negative opinion about a product. But even a positive-labeled review can contain negative opinion about one or more product features. This needs to be considered when mining opinion for individual features and evaluating results.
- It would be interesting to identify the phrases within a sentence that carry positive or negative opinion, rather than tagging the whole sentence as positive or negative.
Related Papers
- Pang, B., L. Lee, and S. Vaithyanathan. 2002. Thumbs up?: sentiment classification using machine learning techniques. In Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume 10, 79–86.
- Turney, P. D. 2002. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics, 417–424.
- M. Hu and B. Liu. Mining Opinion Features in Customer Reviews. In Proceedings of Nineteenth National Conference on Artificial Intelligence. 2004.
- Dave, K., Lawrence, S., and Pennock, D.M. 2003. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. WWW 2003.
- Jianxing Yu, Zheng-Jun Zha, Meng Wang, Kai Wang, Tat-Seng Chua, Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews, EMNLP 2011.
- Bishan Yang and Claire Cardie, Extracting Opinion Expressions with semi-Markov Conditional Random Fields.