Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews
Citation
author = {Jianxing Yu, Zheng-Jun Zha, Meng Wang, Kai Wang, Tat-Seng Chua}, title = {Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews}, booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing}, month = {July}, year = {2011}, pages = {140--150},
Online version
Summary
This paper proposes to organize consumer reviews hierarchically according to an aspect hierarchy. An aspect is a product feature or attribute that concerns users. For example, an iPhone has the following aspects: battery, OS, size, look & feel, weight, camera, memory, applications, etc. Given the consumer reviews of a product, let A = {a_1, · · · , a_k} denote the product aspects commented on in the reviews, and let H^0(A^0, R^0) denote the initial hierarchy derived from domain knowledge, where A^0 is the initial set of aspects and R^0 is the set of relations between them. The first objective of this paper is to construct an aspect hierarchy H(A, R) that covers all the aspects in A and their parent-child relations R. The second is to cluster the reviews under their aspects. The third is to identify the implicit aspects in product reviews and cluster those reviews under the respective aspects.
Dataset
The corpus was crawled by the authors from popular forums such as cnet.com, viewpoints.com, reevoo.com and gsmarena.com. It contains 11 products in four domains, as shown in Table 1. The initial aspect hierarchy was turned into a gold standard with the help of human annotators.
For semantic distance learning, the authors collected 50 hierarchies from WordNet and ODP, as shown in Table 2.
Background
An aspect hierarchy is defined as a tree that consists of a set of unique aspects A = {a_1, · · · , a_k} and a set of parent-child relations R between these aspects.
Methodology
The proposed approach has four components:
1.Initial Hierarchy Acquisition :
Product aspects are extracted from web documents and an initial aspect hierarchy is generated using the approach described by Ye and Chua (2006).
2.Aspect Identification in Customer Reviews :
The authors assume that noun phrases are good candidates for aspects. They therefore extract noun phrases from Pros and Cons reviews (which contain explicit descriptions of product pros and cons) and use them as training data for a one-class SVM classifier. This classifier is then applied to the noun phrases extracted from candidate customer reviews.
3.Semantic Distance Learning :
The distance between two aspects a_x and a_y is measured with the following semantic distance metric: d(a_x, a_y) = Σ_j w_j · f_j(a_x, a_y), where f_j is the j-th feature function and w_j is its weight. The features and their functions are defined as follows:
- Linguistic Features :
- Contextual feature : KL-divergence score between the unigram language models of the two aspects.
- Global contextual feature : The language model is built on the documents containing the aspect.
- Local contextual feature : The language model is built using only the two words on each side of the aspect.
- Co-occurrence feature : The Pointwise Mutual Information (PMI) score between the two aspects. It can be computed at the document level, at the sentence level, or from Google document counts.
- Syntactic feature : Average distance between the two aspects in the syntactic tree built by the Stanford parser.
- Pattern feature : 1 if the two aspects match any of 46 patterns, and 0 otherwise. The patterns are 40 part-of patterns (Girju et al., 2006) and 6 hypernym patterns (Hearst, 1992).
- Lexical feature : The length-difference feature is the difference in word length between the two aspects. The definition-overlap feature counts the words shared by the Google definitions of the two aspects.
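The contextual and co-occurrence features above can be illustrated with a minimal sketch. It assumes add-alpha smoothing over a shared vocabulary and document-level counts; the function names, the smoothing choice and the toy corpus are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter

def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a shared vocabulary (assumed smoothing)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def local_context(sentence, aspect, window=2):
    """Words within `window` positions on each side of every mention of `aspect`."""
    ctx = []
    for i, tok in enumerate(sentence):
        if tok == aspect:
            ctx.extend(sentence[max(0, i - window):i])
            ctx.extend(sentence[i + 1:i + 1 + window])
    return ctx

def pmi(docs, a, b):
    """Document-level PMI: log( P(a, b) / (P(a) * P(b)) )."""
    n = len(docs)
    n_a = sum(1 for d in docs if a in d)
    n_b = sum(1 for d in docs if b in d)
    n_ab = sum(1 for d in docs if a in d and b in d)
    if n_ab == 0:
        return float("-inf")  # the two aspects never co-occur
    return math.log((n_ab / n) / ((n_a / n) * (n_b / n)))

# Toy corpus: each document is a list of tokens (made-up data).
docs = [["battery", "life"], ["battery", "life", "screen"], ["screen"], ["camera"]]
vocab = sorted({w for d in docs for w in d})
p = unigram_model(docs[0], vocab)
q = unigram_model(docs[2], vocab)
contextual_score = kl_divergence(p, q)
cooccurrence_score = pmi(docs, "battery", "life")
```

Note that KL-divergence is asymmetric, so a symmetric distance would need, for example, the average of both directions; the summary does not say which variant is used.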
The weight parameters w in the previous equation are learnt by solving the following optimization problem:
min_w ||d − f w||² + η ||w||²
where the vector d contains the ground-truth distances of all the aspect pairs. The ground-truth distance between two aspects a_x and a_y is obtained by summing the edge distances along the shortest path between a_x and a_y in the initial hierarchy, with every edge weight assumed to be 1.
f stacks the feature vectors of the corresponding pairs, one row per pair. η is the tradeoff parameter.
The optimal solution for w in the above equation is
w* = (fᵀf + ηI)⁻¹ fᵀd
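The ground-truth distance vector d described above can be computed with a breadth-first search over the initial hierarchy, since every edge has weight 1. The sketch below is a hedged illustration; the function names and the toy relations are assumptions made for this example.

```python
from collections import deque
from itertools import combinations

def build_adjacency(parent_child_pairs):
    """Undirected adjacency map of the hierarchy, built from (parent, child) relations."""
    adj = {}
    for parent, child in parent_child_pairs:
        adj.setdefault(parent, set()).add(child)
        adj.setdefault(child, set()).add(parent)
    return adj

def hop_distances(adj, source):
    """Breadth-first search: number of unit-weight edges from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def ground_truth_distances(parent_child_pairs):
    """Shortest-path distance for every unordered aspect pair in the hierarchy."""
    adj = build_adjacency(parent_child_pairs)
    per_source = {a: hop_distances(adj, a) for a in adj}
    return {(a, b): per_source[a][b] for a, b in combinations(sorted(adj), 2)}

# Toy initial hierarchy (made-up relations).
relations = [("phone", "battery"), ("phone", "camera"), ("camera", "lens")]
d = ground_truth_distances(relations)
```

Each key of `d` is an alphabetically ordered aspect pair, so the values can be laid out in a fixed order to form the vector d used for learning.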
The above learning algorithm performs well when sufficient training data is available. Since the initial hierarchy is too coarse to provide enough training pairs, the authors use the WordNet and Open Directory Project (ODP) hierarchies to learn a weight vector w_0, which then assists in learning the optimal distance metric from the initial hierarchy. This can be represented as the following problem:
min_w ||d − f w||² + η ||w||² + γ ||w − w_0||²
where η and γ are tradeoff parameters.
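A regularized least-squares problem of this kind has a closed-form solution. Assuming the objective is the squared error plus η ||w||² plus a term γ ||w − w_0||² that pulls w towards the externally learnt weights w_0, setting the gradient to zero gives (fᵀf + (η + γ)I) w = fᵀd + γ w_0. The sketch below solves this system in plain Python; the helper names and the toy numbers are illustrative assumptions, and `gamma=0` recovers the problem without external hierarchies.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def learn_weights(features, distances, eta, gamma=0.0, w_prior=None):
    """Minimise ||d - F w||^2 + eta ||w||^2 + gamma ||w - w_prior||^2.

    `features` holds one feature vector per aspect pair (rows of F) and
    `distances` the matching ground-truth distances. Requires eta + gamma > 0
    (or a full-rank F) so that the system is non-singular.
    """
    m = len(features[0])
    if w_prior is None:
        w_prior = [0.0] * m
    # Normal equations: (F^T F + (eta + gamma) I) w = F^T d + gamma * w_prior
    lhs = [[sum(row[i] * row[j] for row in features)
            + ((eta + gamma) if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    rhs = [sum(row[i] * dist for row, dist in zip(features, distances))
           + gamma * w_prior[i] for i in range(m)]
    return solve_linear_system(lhs, rhs)

# Toy data: two aspect pairs, two features each (made-up numbers).
features = [[1.0, 0.0], [0.0, 1.0]]
distances = [2.0, 4.0]
w_plain = learn_weights(features, distances, eta=1.0)
w_assisted = learn_weights(features, distances, eta=1.0, gamma=1.0, w_prior=[3.0, 3.0])
```

With a larger `gamma` the solution moves closer to `w_prior`, which is the intended effect when the initial hierarchy supplies only a few training pairs.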
4.Aspect Hierarchy Generation :
The aspects A = {a_1, · · · , a_k} identified in step 2 are then inserted one by one into the initial hierarchy H^0(A^0, R^0).
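The insertion is guided by the following information function and a set of criteria for optimizing the resulting hierarchy.
Info(H) = Σ_{x<y} d(a_x, a_y), the sum of the distances between all pairs of aspects a_x, a_y in the hierarchy H.
i.Minimum hierarchy evolution : The optimal hierarchy introduces the least change of information. Optimize the following objective function
- obj_1 = arg min_{H^(i+1)} ΔInfo(H^(i+1) ⊖ H^(i)), where ΔInfo is the change of information between the hierarchy H^(i) before and the hierarchy H^(i+1) after inserting the (i+1)-th aspect.
ii.Minimum hierarchy discrepancy : A good hierarchy should bring the least change to the initial hierarchy.
iii.Minimum semantic inconsistency : The semantic distance estimated from the hierarchy should be close to the one calculated from the feature functions.
The final objective function is defined as a weighted combination of the objectives obj_1, obj_2 and obj_3 of the three criteria: obj = arg min (λ_1 · obj_1 + λ_2 · obj_2 + λ_3 · obj_3), with λ_1 + λ_2 + λ_3 = 1.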
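The insertion step can be sketched as a greedy search over attachment points. This is a simplified illustration under several assumptions that are not taken from the paper: each new aspect is only ever attached as a leaf, hierarchy distance is the hop count with unit edge weights, Info(H) sums these hop counts over all pairs, and the three criteria are combined with equal hypothetical weights `lams`. The hierarchy is stored as a child-to-parent map with the root mapped to `None`.

```python
from itertools import combinations

def tree_distance(parent, a, b):
    """Hop count between a and b in a single tree stored as a child -> parent map."""
    depth_from_a = {}
    node, steps = a, 0
    while node is not None:          # record every ancestor of a
        depth_from_a[node] = steps
        node, steps = parent.get(node), steps + 1
    node, steps = b, 0
    while node not in depth_from_a:  # climb from b to the lowest common ancestor
        node, steps = parent.get(node), steps + 1
    return steps + depth_from_a[node]

def info(parent):
    """Assumed information function: sum of hierarchy distances over all aspect pairs."""
    return sum(tree_distance(parent, a, b) for a, b in combinations(sorted(parent), 2))

def inconsistency(parent, sem_dist):
    """Squared gap between hierarchy distance and learnt semantic distance."""
    return sum((tree_distance(parent, a, b) - sem_dist(a, b)) ** 2
               for a, b in combinations(sorted(parent), 2))

def insert_aspect(current, initial, aspect, step, sem_dist, lams=(1 / 3, 1 / 3, 1 / 3)):
    """Try every existing node as the parent of `aspect`; keep the lowest-scoring tree."""
    best_score, best_tree = None, None
    for host in sorted(current):
        candidate = dict(current)
        candidate[aspect] = host
        obj1 = info(candidate) - info(current)                # hierarchy evolution
        obj2 = (info(candidate) - info(initial)) / (step + 1)  # hierarchy discrepancy
        obj3 = inconsistency(candidate, sem_dist)             # semantic inconsistency
        score = lams[0] * obj1 + lams[1] * obj2 + lams[2] * obj3
        if best_score is None or score < best_score:
            best_score, best_tree = score, candidate
    return best_tree

# Toy example with made-up semantic distances.
initial = {"phone": None, "battery": "phone", "camera": "phone"}
table = {
    frozenset(("battery", "camera")): 2, frozenset(("battery", "phone")): 1,
    frozenset(("camera", "phone")): 1, frozenset(("lens", "camera")): 1,
    frozenset(("lens", "phone")): 2, frozenset(("lens", "battery")): 3,
}
sem_dist = lambda a, b: table[frozenset((a, b))]
new_tree = insert_aspect(initial, initial, "lens", step=0, sem_dist=sem_dist)
```

In this toy case "lens" ends up under "camera", because that attachment is the only one whose hierarchy distances match the semantic distances exactly.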
Based on the final hierarchy, the customer reviews are organized under their corresponding aspects. The aspect nodes are then pruned, and sentiment classification is performed on the reviews under each aspect.
- Implicit Aspect Identification
The authors assume that reviews of the same aspect use the same sentiment terms, even when the aspect is only implicit (Su et al., 2008). Each customer review is therefore represented as a vector of sentiment terms. The average feature vector is computed for each aspect, and each implicit-aspect review is then allocated to its nearest aspect node.
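The allocation described above amounts to a nearest-centroid assignment. The sketch below is a minimal illustration that assumes term-count vectors and cosine distance; the summary does not state which weighting or distance the authors use, and the sentiment lexicon and reviews here are made up.

```python
import math
from collections import Counter

def sentiment_vector(tokens, sentiment_terms):
    """Count of each sentiment term in a tokenised review."""
    counts = Counter(tokens)
    return [float(counts[t]) for t in sentiment_terms]

def centroid(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def assign_implicit_review(tokens, reviews_by_aspect, sentiment_terms):
    """Return the aspect whose average sentiment vector is nearest to the review."""
    vec = sentiment_vector(tokens, sentiment_terms)
    centroids = {
        aspect: centroid([sentiment_vector(r, sentiment_terms) for r in reviews])
        for aspect, reviews in reviews_by_aspect.items()
    }
    return min(centroids, key=lambda aspect: cosine_distance(vec, centroids[aspect]))

# Toy data: explicit reviews already grouped under their aspects.
sentiment_terms = ["expensive", "cheap", "heavy", "light"]
reviews_by_aspect = {
    "price": [["too", "expensive"], ["quite", "cheap"]],
    "weight": [["very", "heavy"], ["nice", "and", "light"]],
}
assigned = assign_implicit_review(["it", "is", "so", "expensive"],
                                  reviews_by_aspect, sentiment_terms)
```

Here the implicit review never mentions "price", yet it is placed under that aspect because it shares the sentiment term "expensive" with the explicit price reviews.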
Experiment Result
- Aspect Identification
- The proposed approach significantly outperforms the state-of-the-art methods of Hu and Liu (2004) and Wu et al. (2009) in terms of F1-measure by 5.87% and 3.27% respectively.
- Aspect Hierarchy
- The results show that the pattern-based (Hearst, 1992) and clustering-based (Shi et al., 2008) methods perform poorly. The proposed method leverages external hierarchies to derive reliable semantic distances between aspects and thus outperforms Snow et al. (2006) and Yang and Callan (2009).
- Using the initial hierarchy, the proposed approach outperforms the pattern-based method, the clustering-based method, Snow's method and Yang's method by 49.4%, 51.2%, 34.3% and 4.7% respectively.
- Domain knowledge is important in aspect hierarchy generation: F1-measure increases as the initial hierarchy grows larger.
- All three optimization criteria are important.
- All the features and external hierarchies are important. External features boost F1-measure by 2.81%.
- Implicit Aspect Identification
- The authors use mutual clustering (Su et al., 2008) as the baseline and show that the proposed approach is 9.18% better in terms of average F1-measure.
Related Paper
- Ye and T.-S. Chua. Learning Object Models from Semi-structured Web Documents. IEEE Transactions on Knowledge and Data Engineering, 2006.
- Describes how to create an aspect hierarchy by parsing information from web pages.