Naive Bayes classifier learning
Revision as of 20:39, 31 March 2011
This is a method discussed in Social Media Analysis 10-802 in Spring 2010 and Social Media Analysis 10-802 in Spring 2011.
Background
A Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions.
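The wiki markup places an equation image (File:f1.PNG) at this point, and it is not preserved in this text. It presumably states Bayes' theorem. Written for a sentiment label s and a message M, a standard form would be:

```latex
P(s \mid M) = \frac{P(s)\, P(M \mid s)}{P(M)}
```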
Here, we use an example from sentiment analysis on Twitter messages. Let s be the sentiment label and M be the Twitter message. If we assume there are equally sized sets of positive, negative, and neutral messages, the equation simplifies to:
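The equation that followed this sentence is not preserved here. As a hedged reconstruction: with equal class sizes the prior P(s) is the same constant for every label, and P(M) does not depend on s, so Bayes' theorem would reduce to a proportionality:

```latex
P(s \mid M) = \frac{P(s)\, P(M \mid s)}{P(M)} \;\propto\; P(M \mid s)
```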
If we rewrite M as a set of features and assume the features are conditionally independent given the sentiment, we have:
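The equation that followed this sentence is missing from the text. A sketch of what conditional independence gives, assuming M is represented by features f_1, ..., f_n (the symbols f_i and n are introduced here for illustration and do not appear in the original) and equal class priors:

```latex
P(s \mid M) \;\propto\; P(M \mid s) = P(f_1, \dots, f_n \mid s) = \prod_{i=1}^{n} P(f_i \mid s)
```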
Finally, we have the log-likelihood of each sentiment:
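The final equation is not preserved in the text either. Taking the logarithm of a product of per-feature likelihoods turns it into a sum, so one would expect something of the following form. The feature symbols f_1, ..., f_n and the name L for the log-likelihood are illustrative assumptions, not the original notation:

```latex
L(s \mid M) = \sum_{i=1}^{n} \log P(f_i \mid s), \qquad \hat{s} = \arg\max_{s} L(s \mid M)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and it leaves the ranking of the sentiment labels unchanged.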
Applications
Naive Bayes is widely used in information retrieval and information extraction, for example in document categorization and text classification.
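To make the text-classification use concrete, here is a minimal Python sketch of a word-based Naive Bayes sentiment classifier that scores each label by its log-likelihood and assumes equal class priors. Several choices are assumptions made for illustration and are not taken from the page: whitespace tokenization, add-one (Laplace) smoothing, the function names `train`, `log_likelihood`, and `classify`, and the toy training messages.

```python
import math
from collections import Counter, defaultdict


def train(labeled_messages):
    """Count word occurrences per sentiment label and collect the vocabulary."""
    counts = defaultdict(Counter)
    vocab = set()
    for label, text in labeled_messages:
        words = text.lower().split()  # assumption: simple whitespace tokens
        counts[label].update(words)
        vocab.update(words)
    return counts, vocab


def log_likelihood(words, label, counts, vocab, alpha=1.0):
    """Sum of log P(word | label) under the conditional independence assumption.

    Add-alpha smoothing is an assumption of this sketch; it keeps unseen
    words from producing log(0).
    """
    total = sum(counts[label].values())
    denom = total + alpha * len(vocab)
    return sum(math.log((counts[label][w] + alpha) / denom) for w in words)


def classify(text, counts, vocab):
    """Pick the label with the highest log-likelihood (equal priors assumed)."""
    words = text.lower().split()
    return max(counts, key=lambda s: log_likelihood(words, s, counts, vocab))


# Toy data invented for illustration only.
TRAIN = [
    ("positive", "great movie loved it"),
    ("positive", "what a great day"),
    ("negative", "terrible movie hated it"),
    ("negative", "what a terrible day"),
]

counts, vocab = train(TRAIN)
prediction = classify("loved this great film", counts, vocab)
print(prediction)
```

Because the priors are taken to be equal, they are omitted from the score. With unbalanced classes, a `log P(s)` term would need to be added to each label's sum.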
Relevant Papers
| Paper | AddressesProblem | UsesDataset |
|---|---|---|
| Pang et al EMNLP 2002 | Review classification | Pang Movie Reviews |