Difference between revisions of "Chang and Blei, AOAS2010"


Revision as of 12:23, 24 February 2011

Citation

J. Chang and D. Blei. Hierarchical relational models for document networks. Annals of Applied Statistics, 4(1):124–150, 2010

Online version

D. Blei's papers

Motivation

  • Network data
 - social networks of friends
 - citation networks of documents
 - hyperlinked networks of web pages
  • “Predictive Models”
 - point social network members toward new friends
 - point scientific papers toward relevant citations
 - point web pages toward other related pages
  • “Descriptive statistics”
 - uncover the hidden community structure

Methodology

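1. For each document <math>d</math>:

 (a) Draw topic proportions <math>\theta_d|\alpha \sim Dir(\alpha)</math>.
 (b) For each word <math>w_{d,n}</math>:
     i.  Draw assignment <math>z_{d,n}|\theta_d \sim Mult(\theta_d)</math>.
     ii. Draw word <math>w_{d,n}|z_{d,n},\beta_{1:K} \sim Mult(\beta_{z_{d,n}})</math>.

2. For each pair of documents <math>d</math>, <math>d'</math>:

 (a) Draw binary link indicator
              <math>y_{d,d'}|z_d,z_{d'} \sim \psi(\cdot|z_d,z_{d'},\eta)</math>
     where <math>z_d = \{z_{d,1},z_{d,2},\ldots,z_{d,n}\}</math>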
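
The generative process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the notes do not say which link function <math>\psi</math> is used, so the sketch assumes a sigmoid of a weighted product of the two documents' empirical topic frequencies, with an intercept `nu` that does not appear in the notes. All function names and the toy parameter values are hypothetical.

```python
import math
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dir(alpha) by normalising independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]

def sample_document(alpha, beta, n_words, rng):
    """Step 1: topic proportions theta_d, then one (z, w) pair per word."""
    theta = sample_dirichlet(alpha, rng)
    n_topics = len(alpha)
    # z_{d,n} ~ Mult(theta_d)
    z = [rng.choices(range(n_topics), weights=theta)[0] for _ in range(n_words)]
    # w_{d,n} ~ Mult(beta_{z_{d,n}})
    w = [rng.choices(range(len(beta[k])), weights=beta[k])[0] for k in z]
    return z, w

def mean_assignment(z, n_topics):
    """Empirical topic frequencies (z-bar) of one document."""
    counts = [0] * n_topics
    for k in z:
        counts[k] += 1
    return [c / len(z) for c in counts]

def link_probability(z_d, z_dp, eta, nu, n_topics):
    """ASSUMED form of psi: sigmoid(sum_k eta_k * zbar_d[k] * zbar_d'[k] + nu)."""
    zb_d = mean_assignment(z_d, n_topics)
    zb_dp = mean_assignment(z_dp, n_topics)
    score = sum(e * a * b for e, a, b in zip(eta, zb_d, zb_dp)) + nu
    return 1.0 / (1.0 + math.exp(-score))

def sample_corpus(n_docs, n_words, alpha, beta, eta, nu, seed=0):
    """Steps 1 and 2: sample every document, then one link per document pair."""
    rng = random.Random(seed)
    docs = [sample_document(alpha, beta, n_words, rng) for _ in range(n_docs)]
    links = {}
    for d in range(n_docs):
        for dp in range(d + 1, n_docs):
            p = link_probability(docs[d][0], docs[dp][0], eta, nu, len(alpha))
            links[(d, dp)] = 1 if rng.random() < p else 0
    return docs, links

# Toy parameters (hypothetical): 3 topics over a 5-word vocabulary.
alpha = [0.5, 0.5, 0.5]
beta = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6],
]
docs, links = sample_corpus(4, 20, alpha, beta, eta=[4.0, 4.0, 4.0], nu=-1.0)
```

Because the link depends on the topic assignments of the words actually drawn, rather than on the topic proportions alone, the same latent variables have to explain both the text and the links.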

Inference, Estimation and Prediction

  • Inference
  • Estimation
  • Prediction
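 - Link prediction from words
   <math>p(y_{d,d'}|w_d, w_{d'}) \approx E_q [p(y_{d,d'}|\bar{z}_d,\bar{z}_{d'})]</math>
 - Word prediction from links
   <math>p(w_{d,i}|y_d) \approx E_q [p(w_{d,i}|z_{d,i})]</math>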
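
As a rough illustration of link prediction from words, the expectation under <math>q</math> can be estimated in two ways: by sampling topic assignments from <math>q</math>, or by plugging the expected topic frequencies into the link function. The sketch below assumes a mean-field <math>q</math> in which each word has its own categorical distribution `phi[n]` over topics, and again assumes a sigmoid link with an intercept `nu`. Neither is specified in these notes, and all names and numbers are hypothetical.

```python
import math
import random

def sigmoid_link(zbar_d, zbar_dp, eta, nu):
    """ASSUMED link function: sigmoid(sum_k eta_k * zbar_d[k] * zbar_d'[k] + nu)."""
    score = sum(e * a * b for e, a, b in zip(eta, zbar_d, zbar_dp)) + nu
    return 1.0 / (1.0 + math.exp(-score))

def expected_zbar(phi):
    """E_q[z-bar]: average of the per-word topic distributions phi[n][k]."""
    n_topics = len(phi[0])
    return [sum(row[k] for row in phi) / len(phi) for k in range(n_topics)]

def sample_zbar(phi, rng):
    """Draw one topic per word from q and return the empirical frequencies."""
    n_topics = len(phi[0])
    counts = [0] * n_topics
    for row in phi:
        counts[rng.choices(range(n_topics), weights=row)[0]] += 1
    return [c / len(phi) for c in counts]

def link_prob_plugin(phi_d, phi_dp, eta, nu):
    """Plug-in approximation: evaluate the link function at E_q[z-bar]."""
    return sigmoid_link(expected_zbar(phi_d), expected_zbar(phi_dp), eta, nu)

def link_prob_monte_carlo(phi_d, phi_dp, eta, nu, n_samples=2000, seed=0):
    """Monte Carlo estimate of E_q[p(y = 1 | z-bar_d, z-bar_d')]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sigmoid_link(sample_zbar(phi_d, rng), sample_zbar(phi_dp, rng), eta, nu)
    return total / n_samples

# Toy variational parameters (hypothetical): 2 topics, 3 and 2 words.
phi_d = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
phi_dp = [[0.85, 0.15], [0.75, 0.25]]
plugin = link_prob_plugin(phi_d, phi_dp, eta=[3.0, 3.0], nu=-1.0)
sampled = link_prob_monte_carlo(phi_d, phi_dp, eta=[3.0, 3.0], nu=-1.0)
```

The plug-in value is cheaper to compute, while the sampled estimate approximates the expectation itself. For short documents the two can differ noticeably, because the link function is nonlinear.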

Data

  • Cora: abstracts + citation links
  • WebKB: web pages + hyperlinks
  • PNAS: abstracts + intra-PNAS citations
  • LocalNews: local news from each U.S. state + geographical adjacency between states

Results