Difference between revisions of "Naive Bayes classifier learning"
Revision as of 20:35, 31 March 2011
This is a method discussed in Social Media Analysis 10-802 in Spring 2010 and Social Media Analysis 10-802 in Spring 2011.
Background
A Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions.
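Written in the notation of the example below (s for the sentiment label, M for the message), the standard statement of Bayes' theorem is:

```latex
P(s \mid M) = \frac{P(s)\, P(M \mid s)}{P(M)}
```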
Here we use an example from sentiment analysis of Twitter messages. Let s be the sentiment label and M the Twitter message. If we assume equal numbers of positive, negative and neutral messages, the prior P(s) is the same for every label, and the equation simplifies to:
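Assuming the equation in question is Bayes' theorem for P(s | M), equal class sizes make the prior P(s) a constant (1/3 for three classes), and P(M) does not depend on s, so the simplified form would be:

```latex
P(s \mid M) = \frac{\tfrac{1}{3}\, P(M \mid s)}{P(M)} \;\propto\; P(M \mid s)
```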
If we rewrite M as a set of features and assume the features are conditionally independent given s, we have:
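A sketch of this factorization, assuming the features of M are denoted f_1, ..., f_n (a notation chosen here for illustration) and keeping the equal-prior assumption:

```latex
P(M \mid s) = \prod_{i=1}^{n} P(f_i \mid s)
\quad\Longrightarrow\quad
P(s \mid M) \;\propto\; \prod_{i=1}^{n} P(f_i \mid s)
```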
Finally, taking logarithms gives the log-likelihood of each sentiment:
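Under conditional independence the likelihood is a product over features, so its logarithm is a sum. With illustrative feature names f_1, ..., f_n and L as an assumed symbol for the log-likelihood, this would read:

```latex
L(s \mid M) = \sum_{i=1}^{n} \log P(f_i \mid s)
```

The predicted sentiment is then the label s with the largest L(s | M).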
Applications
Naive Bayes is widely used in information retrieval and information extraction, for example in document categorization, text classification and many other problems.
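To illustrate the text-classification use case, the log-likelihood rule from the Background section can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names, the toy training data, the use of whole words as features, and the add-one (Laplace) smoothing are all assumptions made for this example. The prior is taken to be uniform, as in the Twitter example.

```python
import math
from collections import Counter, defaultdict


def train(labeled_messages):
    """Count feature occurrences per sentiment label.

    labeled_messages: iterable of (list_of_features, label) pairs.
    """
    counts = defaultdict(Counter)
    vocab = set()
    for features, label in labeled_messages:
        counts[label].update(features)
        vocab.update(features)
    return counts, vocab


def log_likelihood(features, label, counts, vocab):
    """Sum of log P(f | s), estimated with add-one smoothing (an assumption)."""
    total = sum(counts[label].values())
    denom = total + len(vocab)
    return sum(
        math.log((counts[label][f] + 1) / denom)
        for f in features
        if f in vocab  # features never seen in training are skipped
    )


def classify(features, counts, vocab):
    """Return the label with the highest log-likelihood (uniform prior)."""
    return max(counts, key=lambda s: log_likelihood(features, s, counts, vocab))


# Toy, made-up training data: (features, sentiment label)
training = [
    (["love", "great", "movie"], "positive"),
    (["great", "fun"], "positive"),
    (["hate", "boring", "movie"], "negative"),
    (["boring", "awful"], "negative"),
]
counts, vocab = train(training)
print(classify(["great", "movie"], counts, vocab))  # prints: positive
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and smoothing keeps a single unseen feature-label pair from driving the whole score to minus infinity.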
Relevant Papers
Paper | AddressesProblem | UsesDataset |
---|---|---|
Pang et al EMNLP 2002 | Review classification | Pang Movie Reviews |