Opinion mining
From Cohen Courses
Revision as of 14:34, 1 December 2010 by PastStudents (talk | contribs)
Summary
Opinion mining is a problem in the field of information extraction that aims to automatically extract opinion expressions from product reviews. Another goal of opinion mining techniques is to determine the opinion direction (positive or negative) of a review.
Common Approaches
Generally, there are two approaches to opinion mining: (1) document-level and (2) feature-level opinion mining.
- Document level
- Turney, 2002 presented an approach that calculates opinion orientation using the Web as a corpus. The input review is classified based on the average semantic orientation of the phrases in the review. The PMI-IR technique is used to measure the semantic orientation of each phrase in the review.
- Turney and Littman, 2003 extended the work of Turney, 2002 by using cosine distance in latent semantic analysis as the distance measure.
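The document-level idea above can be sketched roughly as follows. This is a minimal illustration, not Turney's implementation: it assumes the commonly cited PMI-IR form, in which the semantic orientation of a phrase is the log ratio of its co-occurrence hit counts with a positive and a negative reference word. The function names, the `smoothing` parameter, and all hit counts in the usage note are hypothetical, and no real search engine is queried.

```python
import math


def semantic_orientation(hits_near_pos, hits_near_neg, hits_pos, hits_neg,
                         smoothing=0.01):
    """Estimate the semantic orientation (SO) of a phrase from hit counts.

    SO = log2( hits(phrase NEAR pos) * hits(neg)
               / (hits(phrase NEAR neg) * hits(pos)) )

    `smoothing` is an assumed small constant added to the co-occurrence
    counts so that a zero count does not cause a division by zero.
    """
    numerator = (hits_near_pos + smoothing) * hits_neg
    denominator = (hits_near_neg + smoothing) * hits_pos
    return math.log2(numerator / denominator)


def classify_review(phrase_orientations):
    """Classify a review by the average SO of its extracted phrases."""
    if not phrase_orientations:
        raise ValueError("a review needs at least one scored phrase")
    average = sum(phrase_orientations) / len(phrase_orientations)
    label = "positive" if average > 0 else "negative"
    return label, average
```

For example, with made-up counts, a phrase that co-occurs four times as often with the positive reference word as with the negative one (and equal totals for the two reference words) gets an orientation of about 2; averaging the orientations of all phrases in a review then gives the document-level label.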
Some common models for named entity recognition include the following:
- Lexicons
- Checks if a token is part of a predefined set
- Classifying pre-segmented candidates
- Manually select candidates, then use YFCL on a piece of text to determine what type of entity it is
- Sliding Window
- Try all reasonable token windows (different lengths and positions), train a Naive Bayes classifier or YFCL, then extract text if Pr(class=+|prefix, contents, suffix) > some threshold
- Token Tagging / Sequential
- Classify tokens sequentially, with models like Hidden Markov Models, Maximum Entropy Markov Models, or Conditional Random Fields.
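As a rough illustration of the sliding-window item above, the sketch below enumerates token windows with their prefix and suffix context and keeps those whose score exceeds a threshold. The scorer here is a hypothetical lexicon-based stand-in, not a trained Naive Bayes classifier; `LEXICON`, `toy_score`, and the default window sizes are assumptions made only for this example.

```python
def sliding_windows(tokens, max_len=3, context=2):
    """Yield (prefix, contents, suffix) for every window of 1..max_len tokens."""
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            prefix = tokens[max(0, start - context):start]
            contents = tokens[start:end]
            suffix = tokens[end:end + context]
            yield prefix, contents, suffix


def extract(tokens, score, threshold=0.5, max_len=3):
    """Return (text, probability) for windows scoring above the threshold."""
    results = []
    for prefix, contents, suffix in sliding_windows(tokens, max_len):
        probability = score(prefix, contents, suffix)
        if probability > threshold:
            results.append((" ".join(contents), probability))
    return results


# Hypothetical lexicon standing in for a learned model.
LEXICON = {("battery", "life"), ("screen",)}


def toy_score(prefix, contents, suffix):
    """Pretend Pr(class=+ | prefix, contents, suffix): high only for lexicon hits."""
    key = tuple(token.lower() for token in contents)
    return 0.9 if key in LEXICON else 0.1
```

Calling `extract("The battery life is great".split(), toy_score)` keeps only the window "battery life". In practice, `toy_score` would be replaced by a classifier trained on features of the prefix, contents, and suffix.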
Example Systems
References / Links
- BBN Named Entity Types - [1]
- Satoshi Sekine's Extended Named Entity Hierarchy - [2]
- Wikipedia page on Named entity recognition - [3]