Difference between revisions of "Class meeting for 10-605 LDA"

Revision as of 16:28, 19 November 2017

How Gibbs sampling is used to sample from a model.
The "generative story" associated with key models like LDA, naive Bayes, and stochastic block models.
What a "mixed membership" generative model is.
The time complexity and storage requirements of Gibbs sampling for LDA.
How LDA learning can be sped up using IPM (iterative parameter mixing) approaches.
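
As an illustration of the Gibbs sampling and complexity points above, here is a minimal sketch of a collapsed Gibbs sampler for LDA. It is not taken from the lecture: the function name `gibbs_lda`, the hyperparameter defaults `alpha` and `beta`, and the representation of documents as lists of integer word ids are all assumptions made for the example.

```python
import random

def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling sketch; docs is a list of lists of word ids (assumed)."""
    rng = random.Random(seed)
    # Storage: O(D*K) doc-topic counts, O(K*V) topic-word counts,
    # plus one topic assignment per token.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    z = []
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(num_topics)  # random initial assignment
            zd.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        z.append(zd)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional over topics: O(K) work per token,
                # so one full sweep costs O(N*K) for N tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z, doc_topic, topic_word
```

The O(K) loop that builds `weights` for every token is the cost that the faster samplers in the readings aim to reduce.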

Why efficient sampling is important for LDA.
How sampling can be sped up when there are many topics by preprocessing the parameters of the distribution.
How the storage used for LDA can be reduced by exploiting the fact that many words are rare.
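
One well-known way to preprocess a discrete distribution so that each draw takes constant time is Walker's alias method. The sketch below assumes that this is the kind of preprocessing meant above; the names `build_alias` and `alias_draw` are hypothetical and chosen only for this example.

```python
import random

def build_alias(probs):
    """Preprocess a normalized distribution in O(K) into alias tables."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n   # chance of keeping bucket i's own outcome
    alias = [0] * n    # fallback outcome for bucket i
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # The large entry donates mass to fill bucket s up to 1.
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftovers hold (up to rounding) exactly one unit of mass.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def alias_draw(prob, alias, rng):
    """Draw one sample in O(1): pick a bucket, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Building the tables costs O(K), so the approach pays off when many draws are made from the same (or a slowly changing) distribution, as happens when the number of topics is large.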

@@ Line 22: / Line 22: @@
 * [http://people.cs.umass.edu/~mimno/papers/fast-topic-model.pdf Efficient Methods for Topic Model Inference on Streaming Document Collections], Yao, Mimno, McCallum KDD 2009.
 * [http://dl.acm.org/citation.cfm?id=2623756 Reducing the sampling complexity of topic models], Li, Ahmed, Ravi, & Smola, KDD 2014
-* [http://arxiv.org/abs/1412.1576 LightLDA: Big Topic Models on Modest Compute Clusters], Jinhui Yuan, Fei Gao, Qirong Ho, Wei Dai, Jinliang Wei, Xun Zheng, Eric P. Xing, Tie-Yan Liu, Wei-Ying Ma, 2015
+* [https://dl.acm.org/citation.cfm?id=2741682 A Scalable Asynchronous Distributed Algorithm for Topic Modeling], Yu, Hsieh, Yun, Vishwanathan, Dillon, WWW 2015
 === Things to remember ===