Difference between revisions of "Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews"

Revision as of 21:09, 1 October 2012

Citation

 author    = {Jianxing Yu, Zheng-Jun Zha, Meng Wang, Kai Wang, Tat-Seng Chua},
 title     = {Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews},
 booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
 month     = {July},
 year      = {2011},
 pages     = {140--150},

Online version

ACLWEB 2011

Summary

This paper proposes to organize consumer reviews hierarchically according to an aspect hierarchy. An aspect is a product feature or attribute that concerns users. For example, an iPhone has the following aspects: battery, OS, size, look & feel, weight, camera, memory, applications, etc. Given the consumer reviews of a product, let A = { $a_{1}$ , · · · , $a_{k}$ } denote the product aspects commented on in the reviews, and let $H^{0}$ ( $A^{0}$ , $R^{0}$ ) denote the initial hierarchy derived from domain knowledge, where $A^{0}$ is the initial set of aspects and $R^{0}$ is the set of relations between them. The first objective of the paper is to construct an aspect hierarchy H(A,R) that covers all the aspects in A and their parent-child relations R. The second is to cluster the reviews under their aspects. The third is to identify implicit aspects in product reviews and cluster those reviews under the respective aspects.

Dataset

The corpus was crawled by the authors from popular forums such as cnet.com, viewpoints.com, reevoo.com and gsmarena.com. It contains 11 products in four domains, as shown in table 1. The initial aspect hierarchy was turned into a gold standard with the help of human annotators.

For semantic distance learning, the authors collected 50 hierarchies from WordNet and ODP, as shown in table 2.

Background

An aspect hierarchy is defined as a tree that consists of a set of unique aspects A = { $a_{1}$ , · · · , $a_{k}$ } and a set of parent-child relations R between these aspects.

Methodology

The proposed approach has four components for aspect hierarchy generation.

1.Initial Hierarchy Acquisition :

Product aspects are extracted from web documents and an initial aspect hierarchy is generated using the approach described by Ye and Chua 2006.

2.Aspect Identification in Customer Reviews :

The authors assume that noun phrases are good candidates for aspects. They therefore extract noun phrases from Pros and Cons reviews (which contain explicit descriptions of a product's pros and cons) and use them as training data for a one-class SVM classifier. This classifier is then applied to the noun phrases extracted from candidate customer reviews to decide which of them are aspects.

3.Semantic Distance Learning : The following semantic distance metric measures the distance between two aspects $a_{x}$ and $a_{y}$ : $d(a_{x},a_{y})=\sum _{j}w_{j}f_{j}(a_{x},a_{y})$ , where $f_{j}$ is the $j^{th}$ feature function and $w_{j}$ is its weight. The features and their functions are defined as follows:

Linguistic Features :
- Contextual feature : KL-divergence score between the unigram language models of the two aspects.
  - Global contextual feature : The language model is built on the documents containing the aspect.
  - Local contextual feature : The language model is built using only the two words on each side of the aspect.
- Co-occurrence feature : The Pointwise Mutual Information (PMI) score of the two aspects. It can be computed at document level, at sentence level, or using Google document counts.
- Syntactic feature : Average distance between the two aspects in the syntactic tree built by the Stanford parser.
- Pattern feature : 1 if the two aspects match any of 46 patterns: 40 part-of relation patterns (Girju et al., 2006) and 6 hypernym relation patterns (Hearst, 1992).
- Lexical feature : The length difference feature is the difference in aspect word length. The definition overlap feature is the count of overlapping words in the Google definitions of the aspects.
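
As a rough illustration of the contextual and co-occurrence features above, the following Python sketch computes a KL-divergence score between two smoothed unigram language models and a PMI score from raw counts. The function names, the add-alpha smoothing and the count-based PMI signature are assumptions made for this example; the exact smoothing and counting scheme used in the paper is not given on this page.

```python
import math
from collections import Counter


def unigram_lm(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (smoothing is an assumption)."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information from co-occurrence and marginal counts."""
    return math.log((n_xy * n_total) / (n_x * n_y))
```

For the global contextual feature the token lists would come from whole documents containing each aspect; for the local one, from the two words on each side of each mention. In both cases the vocabulary can be taken as the union of the two contexts so that the two models are comparable.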

The weight parameters in the previous equation are learnt using the following optimization problem:

$argmin_{w}||d-f^{T}w||^{2}+\eta ||w||^{2}$

where the vector d contains the ground-truth distances of all the aspect pairs. The ground-truth distance between two aspects is obtained by summing the edge distances along the shortest path between $a_{x}$ and $a_{y}$ in the initial hierarchy, with every edge weight assumed to be 1.

f is the feature vector of the corresponding aspect pair, and $\eta$ is the tradeoff parameter.
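
The ground-truth distance described above is simply a hop count in the initial hierarchy. A minimal sketch, assuming the hierarchy is given as a list of (parent, child) pairs (a representation chosen for this example, not taken from the paper), could use breadth-first search:

```python
from collections import deque


def hierarchy_distance(edges, source, target):
    """Number of edges on the shortest path between two aspects (every edge weighs 1)."""
    # Treat the tree as an undirected graph so paths can go up and down.
    adjacency = {}
    for parent, child in edges:
        adjacency.setdefault(parent, []).append(child)
        adjacency.setdefault(child, []).append(parent)
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return hops[node]
        for neighbour in adjacency.get(node, []):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return None  # the two aspects are not connected
```

Calling this function for every pair of aspects in the initial hierarchy would yield the entries of the vector d.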

The optimal solution for w in the above equation is defined as

$w^{\star }=(f^{T}f+\eta I)^{-1}(f^{T}d)$

The above learning algorithm performs well when sufficient training data is available. Since the initial hierarchy is too coarse to provide enough training data, the authors use the WordNet and Open Directory Project hierarchies to learn $w_{0}$ , which is then used to assist in learning the optimal distance metric from the initial hierarchy. The resulting optimal solution is:

$w^{\star }=(f^{T}f+(\eta +\gamma )I)^{-1}(f^{T}d+\gamma w_{0})$

Where $\eta$ and $\gamma$ are tradeoff parameters.
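
Both closed-form solutions above can be sketched in a few lines of plain Python. The sketch assumes that F stacks one feature row per aspect pair, so that the product of its transpose with itself plays the role of $f^{T}f$ ; the helper names and the small Gaussian-elimination solver are illustrative choices, not part of the paper. Setting gamma to 0 gives the first solution, and a non-zero gamma with a prior w0 gives the second.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def learn_weights(F, d, eta, gamma=0.0, w0=None):
    """Return w = (F^T F + (eta + gamma) I)^-1 (F^T d + gamma w0)."""
    k = len(F[0])
    if w0 is None:
        w0 = [0.0] * k
    # Left-hand side: F^T F plus the regularisation term on the diagonal.
    A = [[sum(row[i] * row[j] for row in F) + ((eta + gamma) if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    # Right-hand side: F^T d plus the pull towards the prior weights w0.
    b = [sum(row[i] * target for row, target in zip(F, d)) + gamma * w0[i]
         for i in range(k)]
    return solve_linear(A, b)
```

Solving the linear system directly avoids forming the matrix inverse explicitly, which is the usual choice for numerical stability.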

4.Aspect Hierarchy Generation : The aspects A = { $a_{1}$ , · · · , $a_{k}$ } identified in step 2 are then inserted one by one into the initial hierarchy $H^{0}$ ( $A^{0}$ , $R^{0}$ ).

The insertion is guided by the following information function and a set of rules for optimizing the resulting hierarchy.

  ${\text{Info(H(A,R))= }}\sum _{x<y;a_{x},a_{y}\in A}d(a_{x},a_{y})$ .

i.Minimum hierarchy evolution : The optimal hierarchy $H^{(i+1)}$ introduces the least change of information relative to $H^{(i)}$ . This is expressed by the following objective function, where a is the aspect being inserted:

 $obj_{1}=argmin_{H^{(i+1)}}(\sum _{x<y;a_{x},a_{y}\in A_{i}\cup \{a\}}d(a_{x},a_{y})-\sum _{x<y;a_{x},a_{y}\in A_{i}}d(a_{x},a_{y}))^{2}$ .

ii.Minimum hierarchy discrepancy : A good hierarchy should introduce the least change relative to the initial hierarchy.

 $obj_{2}=argmin_{H^{(i+1)}}{\frac {1}{i+1}}(\sum _{x<y;a_{x},a_{y}\in A_{i}\cup \{a\}}d(a_{x},a_{y})-\sum _{x<y;a_{x},a_{y}\in A_{0}}d(a_{x},a_{y}))^{2}$

iii.Minimum semantic inconsistency : The semantic distance estimated from the hierarchy, $d^{H}$ , should be close to the distance d calculated from the feature functions.

 $obj_{3}=argmin_{H^{(i+1)}}\sum _{x<y;a_{x},a_{y}\in A_{i}\cup \{a\}}(d^{H}(a_{x},a_{y})-d(a_{x},a_{y}))^{2}$

The final objective function combines $obj_{1},obj_{2},obj_{3}$ :

 $obj=argmin_{H^{(i+1)}}(\lambda _{1}\star obj_{1}+\lambda _{2}\star obj_{2}+\lambda _{3}\star obj_{3})$  where  $\lambda _{1}+\lambda _{2}+\lambda _{3}=1;$  and  $0\leq \lambda _{1},\lambda _{2},\lambda _{3}\leq 1$ .
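
The insertion step can be sketched as a greedy search over candidate parents. The sketch below rests on several assumptions that are not stated on this page: the hierarchy is stored as a child-to-parent dictionary, distances inside the information function are measured along the hierarchy with each edge weighted by the learned distance d between parent and child, and the factor 1/(i+1) is read as one over the number of aspects inserted so far. All function names are hypothetical.

```python
from itertools import combinations


def path_costs(parent, d, node):
    """Cost from `node` to itself and to each of its ancestors, following parent links."""
    costs, cost = {node: 0.0}, 0.0
    while parent[node] is not None:
        cost += d(node, parent[node])  # assumed edge weight: learned distance d
        node = parent[node]
        costs[node] = cost
    return costs


def tree_distance(parent, d, x, y):
    """Path distance between x and y through their lowest common ancestor."""
    cx, cy = path_costs(parent, d, x), path_costs(parent, d, y)
    return min(cx[a] + cy[a] for a in cx if a in cy)


def info(parent, d):
    """Information function: sum of pairwise hierarchy distances."""
    return sum(tree_distance(parent, d, x, y)
               for x, y in combinations(sorted(parent), 2))


def insertion_cost(parent, parent0, d, new, candidate, lambdas):
    """Weighted sum of the three criteria for attaching `new` under `candidate`."""
    trial = dict(parent)
    trial[new] = candidate
    info_new = info(trial, d)
    obj1 = (info_new - info(parent, d)) ** 2            # hierarchy evolution
    inserted = len(trial) - len(parent0)                # assumed meaning of i+1
    obj2 = (info_new - info(parent0, d)) ** 2 / inserted  # hierarchy discrepancy
    obj3 = sum((tree_distance(trial, d, x, y) - d(x, y)) ** 2
               for x, y in combinations(sorted(trial), 2))  # semantic inconsistency
    l1, l2, l3 = lambdas
    return l1 * obj1 + l2 * obj2 + l3 * obj3


def insert_aspect(parent, parent0, d, new, lambdas=(1 / 3, 1 / 3, 1 / 3)):
    """Return a copy of the hierarchy with `new` attached under the cheapest parent."""
    best = min(sorted(parent),
               key=lambda c: insertion_cost(parent, parent0, d, new, c, lambdas))
    updated = dict(parent)
    updated[new] = best
    return updated
```

Repeating insert_aspect for each identified aspect, always passing the untouched initial hierarchy as parent0, mirrors the one-by-one insertion described above.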

Review Organization

Based on the final hierarchy, the customer reviews are organized under their corresponding aspects. The aspect nodes are pruned, and sentiment classification is performed on the reviews under each aspect.

Implicit Aspect Identification : The authors assume that implicit aspect reviews use the same sentiment terms for the same aspect (Su et al., 2008). Each customer review is therefore represented as a vector of sentiment terms. The average feature vector of each aspect is then computed, and each implicit aspect review is allocated to its nearest aspect node.
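
A minimal sketch of this nearest-aspect allocation is given below. It assumes term-count vectors over a fixed sentiment lexicon and cosine similarity as the closeness measure; neither choice is specified on this page, and the function names are hypothetical.

```python
import math
from collections import Counter


def sentiment_vector(tokens, sentiment_terms):
    """Count how often each sentiment term occurs in a review."""
    counts = Counter(t for t in tokens if t in sentiment_terms)
    return [counts[t] for t in sentiment_terms]


def centroid(vectors):
    """Average feature vector of the reviews already filed under one aspect."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_implicit(review_vector, centroids):
    """Pick the aspect whose centroid is most similar to the implicit review."""
    return max(sorted(centroids), key=lambda a: cosine(review_vector, centroids[a]))
```

Here centroids is a dictionary from aspect name to the centroid of its explicit reviews, built once from the organized hierarchy.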

Experiment Result

Aspect Identification
- The proposed approach significantly outperforms the state-of-the-art methods of Hu and Liu, 2004 and Wu et al., 2009 in terms of $F_{1}$-measure, by 5.87% and 3.27% respectively.
Aspect Hierarchy
- The results show that the pattern-based (Hearst, 1992) and clustering-based (Shi et al., 2008) methods perform poorly. The proposed method leverages external hierarchies to derive reliable semantic distances between aspects and thus outperforms Snow et al., 2006 and Yang and Callan, 2009.
- Using the initial hierarchy, the proposed approach outperforms the pattern-based, clustering-based, Snow's and Yang's methods by 49.4%, 51.2%, 34.3% and 4.7% respectively.
- Domain knowledge is important in aspect hierarchy generation: the $F_{1}$-measure increases as the initial hierarchy grows larger.
- All three optimization criteria are important.
- All the features and external hierarchies are important. External features boost the $F_{1}$-measure by 2.81%.
Implicit Aspect Identification
- The authors use mutual clustering (Su et al., 2008) as the baseline and show that the proposed approach is 9.18% better in terms of average $F_{1}$-measure.

Related Paper

Ye and Chua 2006.
- Describes how to create an aspect hierarchy by parsing information from web pages.
Hu and Liu, 2004
