Difference between revisions of "Midterm Report Nitin Yandong Ming Yanbo"

Revision as of 00:31, 17 March 2011

Team members

Nitin Agarwal

Yandong Liu

Yanbo Xu

Ming Sun

LDA results

ATM results

Gibbs Sampling for Collaboration Influence Model

We want $P(Z,X,R|W)$, the posterior distribution of the topic assignments Z, the (author, collaborator) pairs X, and the indicators R of whether collaboration is favored over influence, given the words W in the corpus:

$P(Z,X,R|W)={\frac {P(Z,X,R,W)}{\sum _{Z,X,R}P(Z,X,R,W)}}$

Since the joint factorizes as $P(Z,X,R,W)=P(W|Z,X,R)P(Z,X,R)$, we begin by calculating $P(W|Z,X,R)$ and $P(Z,X,R)$:

$P(W|Z,X,R)=P(W|Z)=\prod _{z=1}^{T}({\frac {\Gamma (\sum _{v=1}^{V}\beta _{v})}{\prod _{v=1}^{V}\Gamma (\beta _{v})}}({\frac {\prod _{v=1}^{V}\Gamma (n_{z}^{w_{v}}+\beta _{v})}{\Gamma (\sum _{v=1}^{V}\beta _{v}+\sum _{v=1}^{V}n_{z}^{w_{v}})}}))$
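Products of Gamma functions overflow quickly, so in practice this term is usually evaluated in log space. The following is a minimal sketch, not the team's implementation; the function name `log_p_words_given_topics` and the nested-list layout of the counts are assumptions made for illustration.

```python
import math


def log_p_words_given_topics(n, beta):
    """Log of P(W|Z) as written above (hypothetical helper).

    n[z][v] is the number of times vocabulary word v is assigned to topic z;
    beta[v] is the Dirichlet hyperparameter for word v.
    """
    beta_sum = sum(beta)
    # log of Gamma(sum_v beta_v) / prod_v Gamma(beta_v), identical for every topic
    log_norm = math.lgamma(beta_sum) - sum(math.lgamma(b) for b in beta)
    total = 0.0
    for row in n:
        total += log_norm
        # log of prod_v Gamma(n_z^{w_v} + beta_v)
        total += sum(math.lgamma(c + b) for c, b in zip(row, beta))
        # log of Gamma(sum_v beta_v + sum_v n_z^{w_v})
        total -= math.lgamma(beta_sum + sum(row))
    return total
```

With a single topic, two vocabulary words, $\beta _{v}=1$ and one token observed, the formula reduces to $\Gamma (2)\Gamma (2)\Gamma (1)/\Gamma (3)=1/2$, which the function reproduces as `math.log(0.5)`.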

$P(Z,X,R)=(\prod _{i_{w}=1}^{W}{\frac {1}{n_{r_{i_{w}}}(a_{i_{w}})+\eta _{r_{i_{w}}}}})\prod _{p=1}^{P}({\frac {\Gamma (\sum _{z}\alpha _{z})}{\prod _{z=1}^{T}\Gamma (\alpha _{z})}}{\frac {\prod _{z}\Gamma (n_{p}^{z}+\alpha _{z})}{\Gamma (\sum _{z}\alpha _{z}+\sum _{z}n_{p}^{z})}})$ ,

where P is the number of distinct (author, collaborator, collaboration-indicator) combinations (a,a',r).

The Gibbs sampling conditional for token i, given the assignments of all other tokens, is then:

$P(z_{i},x_{i},r_{i},w_{i}|Z_{-i},X_{-i},R_{-i},W_{-i})$

$={\frac {P(Z,X,R,W)}{P(Z_{-i},X_{-i},R_{-i},W_{-i})}}$

$={\frac {1}{n_{r_{i}}+\eta _{r_{i}}}}{\frac {n_{p,-i}^{t}+\alpha _{t}}{\sum _{z}n_{p,-i}^{z}+\sum _{z}\alpha _{z}}}{\frac {n_{t,-i}^{w_{v}}+\beta _{v}}{\sum _{v'}n_{t,-i}^{w_{v'}}+\sum _{v'}\beta _{v'}}}$
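The step from the ratio of joint probabilities to the product of simple fractions relies on the identity $\Gamma (x+1)=x\Gamma (x)$. As a sketch for the topic-word factor, assume token i is word $w_{v}$ assigned to topic t, so that adding it back changes only the count $n_{t}^{w_{v}}$ by one; the primed index $v'$ is introduced here only to avoid clashing with the fixed word index v:

```latex
\frac{\Gamma\left(n_{t,-i}^{w_v} + 1 + \beta_v\right)}
     {\Gamma\left(n_{t,-i}^{w_v} + \beta_v\right)}
\cdot
\frac{\Gamma\left(\sum_{v'} n_{t,-i}^{w_{v'}} + \sum_{v'} \beta_{v'}\right)}
     {\Gamma\left(\sum_{v'} n_{t,-i}^{w_{v'}} + 1 + \sum_{v'} \beta_{v'}\right)}
=
\frac{n_{t,-i}^{w_v} + \beta_v}
     {\sum_{v'} n_{t,-i}^{w_{v'}} + \sum_{v'} \beta_{v'}}
```

All Gamma terms for other topics and other words are identical in the numerator and the denominator and cancel. The same argument applied to the $\alpha$ terms of $P(Z,X,R)$ gives the middle factor.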

Further manipulation turns the above equation into update equations for the topic and for the (author, collaborator) pair and collaboration indicator of each corpus token:

$P(z_{i}|Z_{-i},X,W,R)\propto {\frac {n_{z_{i}}^{w_{v}}+\beta _{v}}{\sum _{v'}(n_{z_{i}}^{w_{v'}}+\beta _{v'})}}{\frac {n_{x_{i}}^{z_{i}}+\alpha _{z_{i}}}{\sum _{z'}(n_{x_{i}}^{z'}+\alpha _{z'})}}{\frac {n_{r_{i}}+\eta _{r_{i}}}{\sum _{r'}(n_{r'}+\eta _{r'})}}$

$P(x_{i},r_{i}|Z,X_{-i},W,R_{-i})\propto {\frac {n_{x_{i},r_{i}}^{z_{i}}+\alpha _{z_{i}}}{\sum _{z'}(n_{x_{i},r_{i}}^{z'}+\alpha _{z'})}}{\frac {n_{r_{i}}+\eta _{r_{i}}}{\sum _{r'}(n_{r'}+\eta _{r'})}}$
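The topic update can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not the team's implementation: the names `topic_weights` and `sample_topic` are hypothetical, all counts are assumed to already exclude the current token i, and the factor involving $r_{i}$ is dropped because it does not depend on $z_{i}$ and so vanishes under proportionality.

```python
import random


def topic_weights(word, pair_counts, topic_word_counts, alpha, beta):
    """Unnormalized P(z_i = t | rest) for every topic t (hypothetical helper).

    word: vocabulary index of the current token.
    pair_counts[t]: tokens of the current (author, collaborator) pair in topic t.
    topic_word_counts[t][v]: times word v is assigned to topic t.
    All counts are assumed to exclude the current token.
    """
    alpha_sum = sum(alpha)
    beta_sum = sum(beta)
    pair_total = sum(pair_counts)
    weights = []
    for t, row in enumerate(topic_word_counts):
        # (n_t^{w_v} + beta_v) / sum_v' (n_t^{w_v'} + beta_v')
        word_term = (row[word] + beta[word]) / (sum(row) + beta_sum)
        # (n_x^t + alpha_t) / sum_z' (n_x^{z'} + alpha_z')
        topic_term = (pair_counts[t] + alpha[t]) / (pair_total + alpha_sum)
        weights.append(word_term * topic_term)
    return weights


def sample_topic(weights, rng=random):
    """Draw a topic index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

In a full sampler, one would decrement the counts for token i, call `topic_weights`, draw a new topic with `sample_topic`, and then increment the counts for the new assignment. The pair $(x_{i},r_{i})$ would be resampled in the same way from the second update equation.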


