# Pointwise mutual information

This is a method discussed in Social Media Analysis 10-802 in Spring 2010.

If X and Y are random variables, the pointwise mutual information between two possible outcomes X=x and Y=y is

${\displaystyle PMI(x,y)=\log {\frac {Pr(X=x,Y=y)}{Pr(X=x)Pr(Y=y)}}.}$

This quantity is zero if the events X=x and Y=y are independent, positive if they co-occur more often than independence would predict, and negative if they co-occur less often.
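As a quick illustration (the counts here are made up, and the function name is my own), PMI can be estimated directly from co-occurrence counts:

```python
import math

def pmi(joint_count, count_x, count_y, total):
    """Pointwise mutual information estimated from raw counts.

    joint_count: number of observations where x and y occur together
    count_x, count_y: marginal occurrence counts of x and of y
    total: total number of observations
    """
    p_xy = joint_count / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))

# Toy corpus of 100 documents: x appears in 20, y in 10, both in 5.
# Independence would predict 20 * 10 / 100 = 2 joint documents,
# so x and y co-occur more than chance and PMI is positive.
print(pmi(5, 20, 10, 100))  # log(2.5), about 0.916
```

With 2 joint documents instead of 5 the observed rate matches the independence prediction exactly and the PMI is (up to floating-point error) zero.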

Turney (ACL 2002) used this quantity to assess the semantic orientation of words or phrases. Specifically, the semantic orientation of x was defined as

${\displaystyle SO(x)=PMI(x,{\textit {'excellent'}})-PMI(x,{\textit {'poor'}})}$

In more detail, Turney interpreted "X=x and Y=y" as an event where two words x and y occur nearby in the same document, and "X=x" as an event where word x occurs in a document. After some simplification, SO(x) can then be written as

${\displaystyle \log {\frac {{\textit {Hits(x~near~'excellent')}}\cdot {\textit {Hits('poor')}}}{{\textit {Hits(x~near~'poor')}}\cdot {\textit {Hits('excellent')}}}}}$
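The simplification behind this formula is short; writing P for the probability estimates, the P(x) terms cancel in the difference of the two logs, and estimating each probability as Hits(.)/N for a corpus of size N makes the N factors cancel as well:

```latex
\begin{align*}
SO(x) &= \log\frac{P(x~\text{near}~\textit{excellent})}
                  {P(x)\,P(\textit{excellent})}
       - \log\frac{P(x~\text{near}~\textit{poor})}
                  {P(x)\,P(\textit{poor})} \\
      &= \log\frac{P(x~\text{near}~\textit{excellent})\,P(\textit{poor})}
                  {P(x~\text{near}~\textit{poor})\,P(\textit{excellent})}
\end{align*}
```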

This means that SO(x) can be computed quickly: Hits('excellent') and Hits('poor') are constants that can be looked up once and cached, so each new word or phrase x requires just two search-engine queries, Hits(x near 'excellent') and Hits(x near 'poor').
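A minimal sketch of the hit-count formula (the hit counts below are hypothetical, and the function name and smoothing constant are my own choices; Turney likewise added a small constant to the hit counts to avoid division by zero):

```python
import math

def semantic_orientation(hits_x_near_exc, hits_x_near_poor,
                         hits_exc, hits_poor, smoothing=0.01):
    """Turney-style SO(x) from search-engine hit counts.

    The smoothing constant keeps the ratio finite when a phrase
    never co-occurs with one of the seed words.
    """
    return math.log(
        ((hits_x_near_exc + smoothing) * hits_poor) /
        ((hits_x_near_poor + smoothing) * hits_exc)
    )

# Hypothetical hit counts. The seed-word totals are constants that
# can be cached, so each new phrase x costs only two queries.
HITS_EXCELLENT, HITS_POOR = 1_000_000, 800_000
so = semantic_orientation(2_000, 500, HITS_EXCELLENT, HITS_POOR)
print(so)  # positive: x leans toward 'excellent'
```

Swapping the two near-counts flips the sign, so a phrase seen mostly near 'poor' comes out with negative orientation.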