Difference between revisions of "Chang and Blei, AOAS2010"

Latest revision as of 15:40, 31 March 2011

Citation

J. Chang and D. Blei. Hierarchical relational models for document networks. Annals of Applied Statistics, 4(1):124–150, 2010

Online version

D.Blei's papers

Motivation

Network data, such as social networks of friends, citation networks of documents, and hyperlinked networks of web pages, motivate two goals. The first is predictive: pointing social network members toward new friends, scientific papers toward relevant citations, or web pages toward other related pages. The second is descriptive: uncovering the hidden community structure. This paper develops a hierarchical model of both network structure and node attributes, based on Latent Dirichlet Allocation.

Methodology

1. For each document $d$ :

 (a) Draw topic proportions  $\theta _{d}|\alpha \sim Dir(\alpha )$ 
 (b) For each word  $w_{d,n}$ :
     i.  Draw assignment  $z_{d,n}|\theta _{d}\sim Mult(\theta _{d})$ .
     ii. Draw word  $w_{d,n}|z_{d,n},\beta _{1:K}\sim Mult(\beta _{z_{d,n}})$ .

2. For each pair of documents $d$ , $d'$ :

 (a) Draw binary link indicator
               $y_{d,d'}|z_{d},z_{d'}\sim \psi (\cdot |z_{d},z_{d'},\eta )$
     where  $z_{d}=\{z_{d,1},z_{d,2},\ldots ,z_{d,N_{d}}\}$ is the set of topic assignments for the $N_{d}$ words of document $d$.
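The generative process above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The page does not state the form of the link function $\psi$, so the sketch assumes a logistic function of the element-wise product of the documents' empirical topic frequencies, with an intercept `nu`; the function name `sample_rtm`, the intercept, and the choice to make links symmetric are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_rtm(doc_lengths, alpha, beta, eta, nu, rng):
    """Sample words, topic assignments, and links from the process above.

    alpha: (K,) Dirichlet parameter; beta: (K, V) topic-word distributions;
    eta: (K,) link weights; nu: scalar intercept (an assumption, see lead-in).
    """
    K, V = beta.shape
    words, assignments = [], []
    for n_words in doc_lengths:
        theta = rng.dirichlet(alpha)                  # step 1(a): topic proportions
        z = rng.choice(K, size=n_words, p=theta)      # step 1(b)i: topic assignments
        w = np.array([rng.choice(V, p=beta[k]) for k in z], dtype=int)  # step 1(b)ii
        assignments.append(z)
        words.append(w)

    # Empirical topic frequencies of each document (mean of one-hot assignments).
    z_bar = np.array([np.bincount(z, minlength=K) / len(z) for z in assignments])

    D = len(doc_lengths)
    links = np.zeros((D, D), dtype=int)
    for d in range(D):
        for e in range(d + 1, D):
            # step 2(a): assumed logistic link function psi
            p = sigmoid(eta @ (z_bar[d] * z_bar[e]) + nu)
            links[d, e] = links[e, d] = rng.binomial(1, p)
    return words, assignments, links
```

For example, `sample_rtm([30, 25, 40], np.ones(3), beta, 5.0 * np.ones(3), -1.0, np.random.default_rng(0))` with a `beta` of shape `(3, V)` draws three documents and a symmetric 3 x 3 link matrix. Because the link depends on the topic assignments rather than on the topic proportions, the same latent variables have to explain both the words and the links.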

Inference, Estimation and Prediction

Prediction

Link prediction from words

    $p(y_{d,d'}|w_{d},w_{d'})\approx E_{q}[p(y_{d,d'}|{\bar {z_{d}}},{\bar {z_{d'}}})]$

Word prediction from links

    $p(w_{d,i}|y_{d})\approx E_{q}[p(w_{d,i}|z_{d,i})]$
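Both approximations can be sketched in Python once a variational distribution $q$ is available. The sketch below assumes a mean-field $q$ in which each word has a vector of topic probabilities (rows of `phi_d`), and it assumes the same logistic link function with intercept `nu` as an illustrative choice of $\psi$. For links, it uses a plug-in estimate that replaces $\bar{z}_d$ (the average of a document's topic assignments) by its expectation under $q$, which is only an approximation of the expectation in the formula. The function names are hypothetical.

```python
import numpy as np

def predict_link(phi_d, phi_e, eta, nu):
    """Plug-in estimate of p(y = 1 | w_d, w_e).

    phi_d, phi_e: (N, K) per-word variational topic probabilities.
    Under a mean-field q, E_q[z_bar] is the column mean of phi.
    """
    pi = phi_d.mean(axis=0) * phi_e.mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(eta @ pi + nu)))

def predict_word_dist(q_z, beta):
    """E_q[p(w | z)] = sum_k q(z = k) * beta[k, w].

    q_z: (K,) topic probabilities for the word position;
    beta: (K, V) topic-word distributions. Returns a (V,) distribution.
    """
    return q_z @ beta
```

The word prediction is exact given `q_z`, since the expectation is a finite sum over topics; the link prediction trades exactness for a closed form.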

Data

Cora: abstracts + citation links
WebKB: web pages + hyperlinks
PNAS: abstracts + intra-PNAS citations
LocalNews: local news from each U.S. state + geographical adjacency between states

Results

Evaluating the predictive distribution

Automatic link suggestion

Modeling spatial data

Related papers

Airoldi et al, ML2008 AIROLDI, E., BLEI, D., FIENBERG, S. and XING, E. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981–2014.

Blei et al, NIPS2007 BLEI, D. M. and MCAULIFFE, J. D. (2007). Supervised topic models. In Neural Information Processing Systems. Vancouver.

Dietz et al, ICML2007 DIETZ, L., BICKEL, S. and SCHEFFER, T. (2007). Unsupervised prediction of citation influences. In Proc. ICML. Available at http://portal.acm.org/citation.cfm?id=1273526.

Nallapati et al, ACM2008 NALLAPATI, R., AHMED, A., XING, E. P. and COHEN, W. W. (2008). Joint latent topic models for text and citations. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 542–550. ACM Press, New York.

