Sentiment analysis

From Cohen Courses
Revision as of 22:30, 31 March 2011 by Bolin (talk | contribs)

This is a problem discussed in Social Media Analysis 10-802 in Spring 2011.

Introduction

Sentiment analysis or opinion mining refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials.

Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall tonality of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).

Methods

Computers can perform automated sentiment analysis of digital texts using machine learning techniques such as latent semantic analysis, support vector machines, "bag of words" representations, and Semantic Orientation from Pointwise Mutual Information (see Peter Turney's work in this area [2]). More sophisticated methods try to detect the holder of a sentiment (i.e. the person who maintains that affective state) and the target (i.e. the entity about which the affect is felt) [13]. To mine an opinion in context and identify the feature about which it is expressed, the grammatical relationships between words are used. These grammatical dependency relations are obtained by deep parsing of the text [14].
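
As a rough illustration of the Semantic Orientation idea, the sketch below scores a word by how much more strongly it co-occurs with a positive reference word than with a negative one: SO(word) = PMI(word, "excellent") - PMI(word, "poor"). Several things here are assumptions made for the example and are not taken from this page: the toy corpus, the function names hits and semantic_orientation, the use of "same document" as the co-occurrence window in place of search-engine hit counts, and the 0.01 smoothing constant. Both reference words must occur in the corpus, otherwise the ratio is undefined.

```python
import math

# Toy corpus (made up for illustration); each document becomes a set of tokens.
DOCS = [
    "excellent phone with a great screen",
    "great battery and excellent camera",
    "poor battery and terrible screen",
    "terrible service and poor support",
    "excellent support",
]
CORPUS = [set(doc.split()) for doc in DOCS]

def hits(corpus, *words):
    """Count documents containing every given word (stand-in for search hits)."""
    return sum(1 for doc in corpus if all(w in doc for w in words))

def semantic_orientation(corpus, word, pos="excellent", neg="poor", smooth=0.01):
    """SO(word) = PMI(word, pos) - PMI(word, neg), written as one log ratio.

    The smoothing constant keeps the ratio finite when a word never
    co-occurs with one of the reference words.
    """
    near_pos = hits(corpus, word, pos) + smooth
    near_neg = hits(corpus, word, neg) + smooth
    return math.log2((near_pos * hits(corpus, neg)) / (near_neg * hits(corpus, pos)))

if __name__ == "__main__":
    for w in ("great", "terrible"):
        print(w, round(semantic_orientation(CORPUS, w), 2))
```

A positive score suggests the word leans towards the positive reference word and a negative score towards the negative one. On a corpus this small the numbers mean very little; the point is only the shape of the computation.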

In Sentic computing [15], a multi-disciplinary approach to opinion mining and sentiment analysis, text processing is based not on statistical learning models but on common-sense reasoning tools and affective ontologies. Unlike statistical classification, which generally requires large inputs and therefore cannot appraise texts at a fine granularity, Sentic computing enables the analysis of documents not only at the page or paragraph level but also at the sentence level.
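
To make "sentence-level" analysis concrete, here is a minimal sketch that scores each sentence separately by looking words up in a small polarity lexicon. This is not Sentic computing and does not use its reasoning tools or ontologies; it is a much simpler lexicon lookup, shown only to illustrate analysis without a trained statistical model. The lexicon entries, their scores, and the function names split_sentences and sentence_scores are all hypothetical.

```python
import re

# Hypothetical toy lexicon: word -> polarity score (values made up for illustration).
LEXICON = {
    "love": 1.0, "great": 1.0, "excellent": 1.0,
    "poor": -1.0, "terrible": -1.0, "hate": -1.0,
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentence_scores(text):
    """Return (sentence, score) pairs; score is the sum of lexicon polarities."""
    results = []
    for sentence in split_sentences(text):
        words = re.findall(r"[a-z']+", sentence.lower())
        results.append((sentence, sum(LEXICON.get(w, 0.0) for w in words)))
    return results

if __name__ == "__main__":
    review = "The screen is great. The battery is terrible! I would buy it again."
    for sentence, score in sentence_scores(review):
        print(score, sentence)
```

A lookup like this ignores negation, sarcasm, and context, which is part of why richer resources such as affective ontologies are of interest.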

Open source software tools apply machine learning, statistics, and natural language processing techniques to automate sentiment analysis on large collections of text, such as web pages, online news, internet discussion groups, online reviews, web blogs, and social media such as Twitter.
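
The following sketch shows, under stated assumptions, what automating the task over a collection of texts can look like with a "bag of words" representation. It uses a multinomial Naive Bayes classifier with add-one smoothing, which is a swapped-in technique chosen for brevity and not one named on this page; the training sentences, the labels, and the function names train and classify are hypothetical. It depends only on the Python standard library and does not reproduce the API of any particular tool.

```python
import math
from collections import Counter

# Made-up labelled examples; a real collection would be far larger.
TRAINING = [
    ("great phone and excellent screen", "positive"),
    ("love the excellent camera", "positive"),
    ("poor battery and terrible screen", "negative"),
    ("terrible support and poor service", "negative"),
]

def train(labeled_docs):
    """Collect per-class word counts and per-class document counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability under Naive Bayes."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)   # class prior
        denom = sum(counts.values()) + len(vocab)          # add-one smoothing
        for w in text.lower().split():
            if w in vocab:                                 # skip unseen words
                score += math.log((counts[w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

if __name__ == "__main__":
    model = train(TRAINING)
    for doc in ("excellent screen", "terrible battery"):
        print(doc, "->", classify(doc, *model))
```

Word order is discarded entirely, which is what "bag of words" means; the same counts could feed other classifiers, such as a support vector machine.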