Hierarchical Dirichlet process

From Cohen Courses
Revision as of 14:46, 31 March 2011 by Yanbox (talk | contribs) (Created page with 'A hierarchical Dirichlet process is a distribution over a set of random probability measures over <math>(\theta; B)</math>. The process defines a set of random probability measur…')
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

A hierarchical Dirichlet process is a distribution over a set of random probability measures over . The process defines a set of random probability measures , one for each group, and a global random probability measure . The global measure is distributed as a Dirichlet process with concentration parameter and base probability measure H:

and the random measures are conditionally independent given G0, with distributions given by a Dirichlet process with base probability measure :

.

A hierarchical Dirichlet process can be used as the prior distribution over the factors for grouped data. For each j let be i.i.d. random variables distributed as . Each is a factor corresponding to a single observation . The likelihood is given by:

.

The hierarchical Dirichlet process can readily be extended to more than two levels. That is, the base measure H can itself be a draw from a DP, and the hierarchy can be extended for as many levels as are deemed useful.


  • The stick-breaking construction

Given that the global measure is distributed as a Dirichlet process, it can be expressed using a stick-breaking representation:

where independently and are mutually independent. Since has support at the points , each necessarily has support at these points as well, and can thus be written as:

Let . Note that the weights are independent given (since the are independent given ). These weights are related to the global weights .

An equivalent representation of the hierarchical Dirichlet process mixture can be:

.

After some derivations, the relation between weights and is:

.


  • Chinese restaurant franchise

Crf.jpg

The restaurants correspond to groups and the customers correspond to the factors . Let denote K i.i.d. random variables distributed according to H; this is the global menu of dishes. Vairables represent the table-specific choice of dishes; particular, is the dish served at table t and restaurant j. Use notation to denote the number of maintain counts of customers and counts of tables. Then,

,