This paper is available online [1].
Summary
Sparse Additive Generative Models of Text, or SAGE, is an interesting alternative to traditional generative models for text. The key insight of the paper is that latent classes or topics can be modeled as deviations in log-frequency from a constant background distribution. This formulation makes it natural to enforce sparsity, which the authors argue prevents overfitting. Additionally, generative facets can be combined through simple addition in log space, avoiding the need for switching variables.
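In the paper's notation, with m the background log-frequency vector and η_k the sparse deviation for latent class k, the word distribution for class k is obtained by exponentiating and renormalizing the sum:

<math>p(w \mid k) = \frac{\exp(m_w + \eta_{k,w})}{\sum_{w'} \exp(m_{w'} + \eta_{k,w'})}</math>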
Datasets
Methodology
The key insight of this paper is that a generative model can be thought of as a deviation from a background distribution in log-space. Concretely, each latent class or topic is represented by a sparse vector of log-frequency deviations that is added to the background log-frequencies; words that do not distinguish the class keep a deviation of exactly zero. The authors propose their method to address four main problems that they see in Dirichlet-multinomial generative models: Overfitting, Overparametrization, Inference Cost, and Lack of Sparsity.
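To make the additive formulation concrete, here is a minimal sketch in Python of how a SAGE-style word distribution can be computed. The variable names, the toy numbers, and the word_distribution helper are illustrative assumptions, not the authors' code:

<pre>
import numpy as np

# Background distribution over a toy 5-word vocabulary, stored as
# log-frequencies (m in the paper's notation).
m = np.log(np.array([0.40, 0.30, 0.15, 0.10, 0.05]))

# Sparse deviation for one topic: most entries are exactly zero, so the
# topic is parameterized only by the words on which it differs from the
# background.
eta_topic = np.array([0.0, 0.0, 1.2, 0.0, -0.8])

def word_distribution(m, *etas):
    """Add any number of facet deviations to the background in log
    space, then exponentiate and renormalize (a softmax)."""
    logits = m + sum(etas)
    probs = np.exp(logits - logits.max())  # shift by max for stability
    return probs / probs.sum()

beta_topic = word_distribution(m, eta_topic)

# A second facet (e.g., an author or regional effect) combines with the
# topic by plain addition in log space -- no switching variable needed.
eta_region = np.array([0.5, 0.0, 0.0, 0.0, 0.0])
beta_combined = word_distribution(m, eta_topic, eta_region)
print(beta_topic, beta_combined)
</pre>

Because each deviation vector is sparse, a new facet adds parameters only for the words it actually affects, which is the source of the overfitting and overparametrization claims above.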