Difference between revisions of "Class meeting for 10-605 LDA"

From Cohen Courses
Line 22:

* [http://people.cs.umass.edu/~mimno/papers/fast-topic-model.pdf Efficient Methods for Topic Model Inference on Streaming Document Collections], Yao, Mimno, McCallum, KDD 2009.
* [http://dl.acm.org/citation.cfm?id=2623756 Reducing the sampling complexity of topic models], Li, Ahmed, Ravi, & Smola, KDD 2014
* [http://arxiv.org/abs/1412.1576 LightLDA: Big Topic Models on Modest Compute Clusters], Jinhui Yuan, Fei Gao, Qirong Ho, Wei Dai, Jinliang Wei, Xun Zheng, Eric P. Xing, Tie-Yan Liu, Wei-Ying Ma, 2015
+ * [https://dl.acm.org/citation.cfm?id=2741682 A Scalable Asynchronous Distributed Algorithm for Topic Modeling], Yu, Hsieh, Yun, Vishwanathan, Dillon, WWW 2015

=== Things to remember ===

Revision as of 15:28, 19 November 2017

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2016.

Slides

Quiz

Readings

Basic LDA:

  • Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3 (Jan 2003): 993-1022.

Speedups for LDA:

Things to remember

  • How Gibbs sampling is used to sample from a model.
  • The "generative story" associated with key models like LDA, naive Bayes, and stochastic block models.
  • What a "mixed membership" generative model is.
  • The time complexity and storage requirements of Gibbs sampling for LDA.
  • How LDA learning can be sped up using iterative parameter mixing (IPM) approaches.
  • Why efficient sampling is important for LDA.
  • How sampling can be sped up for many topics by preprocessing the parameters of the distribution.
  • How the storage used for LDA can be reduced by exploiting the fact that many words are rare.
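
As a study aid for the points above on Gibbs sampling and its cost, here is a minimal sketch of a collapsed Gibbs sampler for LDA. It is not taken from the course materials: the function name gibbs_lda, the hyperparameter defaults, and the toy corpus are all assumptions made for illustration. Documents are assumed to be lists of integer word ids. The sketch makes the costs visible: each token update scores all K topics, and storage is one topic label per token plus the document-topic and topic-word count tables.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (z, doc_topic, topic_word): per-token topic labels and count tables.
    """
    rng = random.Random(seed)
    # Storage: O(D*K) + O(K*V) counts, plus one topic label per token.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    z = []

    # Random initialization of topic labels.
    for d, doc in enumerate(docs):
        labels = []
        for w in doc:
            k = rng.randrange(num_topics)
            labels.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        z.append(labels)

    topics = range(num_topics)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional over topics: O(K) work per token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                # Record the new assignment.
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z, doc_topic, topic_word


# Toy corpus (hypothetical): 3 documents over a 4-word vocabulary.
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]
z, doc_topic, topic_word = gibbs_lda(toy_docs, num_topics=2, vocab_size=4, iters=20)
```

Because every token update loops over all topics, one sweep costs on the order of (number of tokens) x K, which is why the speedup readings focus on making the per-token draw cheaper.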
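
To illustrate the point above about preprocessing the parameters of a distribution, the sketch below uses the alias method (Vose's variant). The page does not name a specific technique, so treating the alias method as the example is an assumption; the function names build_alias and alias_draw are hypothetical. After an O(K) table-building pass over K topic weights, each draw takes O(1) time instead of O(K).

```python
import random


def build_alias(weights):
    """Preprocess nonnegative weights into alias tables in O(K) time."""
    n = len(weights)
    total = sum(weights)
    # Scale so that the average entry is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        # Slot s keeps its own mass; the rest of the slot is filled from l.
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftover entries (up to rounding) fill their whole slot.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_draw(prob, alias, rng):
    """Draw one index in O(1): pick a slot, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Hypothetical topic weights for one word.
topic_weights = [0.5, 0.3, 0.1, 0.1]
prob, alias = build_alias(topic_weights)
rng = random.Random(0)
samples = [alias_draw(prob, alias, rng) for _ in range(1000)]
```

The table is only valid for the weights it was built from, so a sampler whose counts change after every draw has to either rebuild the table periodically or correct for the stale weights in some other way.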