Difference between revisions of "Gibbs sampling"
Line 12: | Line 12: | ||
2. Repeat for <math>t = 1, 2, \cdots, </math>: | 2. Repeat for <math>t = 1, 2, \cdots, </math>: | ||
− | + | For <math>k = 1, 2, \cdots, K \mbox{ generate } X_k^{(t)} \mbox{ from } f(X_k^{(t)} | X_1^{(t)}, \cdots, X_{k-1}^{(t)}, X_{k+1}^{(t-1)}, \cdots X_K^{(t-1)})</math> | |
− | 3. Continue step 2 until joint distribution of <math>(X_1^{(t)}, \cdots, X_K^{(t)})</math> doesn't change. | + | 3. Continue step 2 until joint distribution of <math>f(X_1^{(t)}, \cdots, X_K^{(t)})</math> doesn't change. |
− | == A Simple proof of Convergence == | + | Under regularity conditions, it can be shown that this procedure eventually stabilizes, and the resulting random variables are indeed a sample from <math>X_1, \cdots, X_K</math> |
+ | |||
+ | == A Simple proof of Convergence for bivariate case== | ||
+ | |||
+ | Consider a bivariate system of bernoulli random variables <math>X</math> and <math>Y</math>. Define two matrices <math>A_{x|y}</math> and <math>A_{y|x}</math> such that <math>A_{x|y}(i, j) = f(x=i | y=j)</math> and <math>A_{y|x}(i, j) = f(y=i | x=j)</math>. Then the transition probability of <math>X</math> to <math>X</math> can be given as <math>A_{x|x} = A_{x|y} A_{y|x}</math>. If the initial distribution to start with was <math>f_0 = [f_0(1), f_0(2)]^T</math>, then at the <math>k^{th}</math> iteration, <math>f_k = f_0 A_{x|x}^k</math>. It is well known that as <math>k \rightarrow \infty</math>, <math>f_k</math> approaches a stationary point. The stationary point represents <math>f(X)</math>. | ||
== Burnout == | == Burnout == | ||
− | + | Gibbs sampler requires certain number of iterations until it approaches the stationary state, and generate samples from the marginal distribution. To allow this, first few samples (typically in the order of 500-1000) are discarded. This is known as burnout. | |
+ | |||
== Used In == | == Used In == | ||
+ | |||
+ | |||
== References == | == References == | ||
Line 29: | Line 36: | ||
2. http://biostat.jhsph.edu/~mmccall/articles/casella_1992.pdf | 2. http://biostat.jhsph.edu/~mmccall/articles/casella_1992.pdf | ||
+ | |||
+ | 3. Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning. |
Revision as of 13:59, 30 September 2011
Under modification by User:dkulkarn
Gibbs sampling is used to sample from the stable joint distribution of two or more random variables when accurate computation of the integral or a marginal is intractable. Usually some variables in this set of random variables are the actual observables and hence there values need not be sampled in the Gibbs sampling iterations. This form of approximate inference method is generally used when doing posterior probability inference in probabilistic graphical models where computation of marginals are intractable.
Contents
Motivation
Gibbs sampling was introduced in the context of image processing by Geman and Geman[1]. The Gibbs sampler is a technique for generating random variables from a (marginal) distribution indirectly, without having to calculate the density[2]. Thus, if we are given with conditional densities , we can use Gibbs sampling to calculate the marginal distributions or any other function of .
Algorithm
1. Take some initial values
2. Repeat for :
For
3. Continue step 2 until joint distribution of doesn't change.
Under regularity conditions, it can be shown that this procedure eventually stabilizes, and the resulting random variables are indeed a sample from
A Simple proof of Convergence for bivariate case
Consider a bivariate system of bernoulli random variables and . Define two matrices and such that and . Then the transition probability of to can be given as . If the initial distribution to start with was , then at the iteration, . It is well known that as , approaches a stationary point. The stationary point represents .
Burnout
Gibbs sampler requires certain number of iterations until it approaches the stationary state, and generate samples from the marginal distribution. To allow this, first few samples (typically in the order of 500-1000) are discarded. This is known as burnout.
Used In
References
1. Geman and Geman
2. http://biostat.jhsph.edu/~mmccall/articles/casella_1992.pdf
3. Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning.