Difference between revisions of "Markov Logic Networks"
(10 intermediate revisions by 2 users not shown) | |||
Line 3: | Line 3: | ||
== Definition == | == Definition == | ||
− | A Markov Logic Network L is a set of pairs <math>(F_{i}, w_{i})</math> where <math>F_{i}</math> is a formula in first-order logic and <math>w_{i}</math> is a real number. Given C is a finite set of constants <math>C = {c_{1}, c_{2}, ...}</math>, L defines together with C a Markov Network <math>M_{L, C}</math>: | + | A Markov Logic Network <math>L</math> is a set of pairs <math>(F_{i}, w_{i})</math> where <math>F_{i}</math> is a formula in first-order logic and <math>w_{i}</math> is a real number. Given <math>C</math> is a finite set of constants <math>C = \{ c_{1}, c_{2}, ... \}</math>, <math>L</math> defines together with <math>C</math> a Markov Network <math>M_{L, C}</math> in the following way: |
− | * <math>M_{L, C}</math> contains one binary node for each possible grounding of each predicate in L. If the ground atom is true, the value of the node is 1, and 0 otherwise. | + | * <math>M_{L, C}</math> contains one binary node for each possible grounding of each predicate in <math>L</math>. If the ground atom is true, the value of the node is 1, and 0 otherwise. |
− | * ''M<sub>L, C</sub>'' contains one | + | * ''M<sub>L, C</sub>'' contains one feature for each possible grounding of each formula <math>F_{i}</math> in <math>L</math>. If the ground formula is true, the value of the feature is 1, and 0 otherwise; and the weight of the feature is <math>w_{i}</math>. |
+ | |||
+ | In other words, the Markov network has an edge between two nodes ''iff'' the corresponding predicates appear together in at least one formula in <math>L</math>. Then, the probability distribution over possible instances <math>x</math> specified by ''M<sub>L, C</sub>'' is: | ||
+ | |||
+ | <math>P(X=x) = \frac{1}{Z} \prod_{i}{\phi_{i}(x_{\{i\}})}</math> | ||
+ | |||
+ | where <math>Z</math> is the normalization factor, <math>\phi_{i}(x_{\{i\}})</math> is the potential function defined on the <math>i</math>-th clique which is related to <math>F_{i}</math>, and <math>x_{\{i\}}</math> is the discrete variable vector in the clique. Given that <math>\phi_{i}(x_{\{i\}})</math> equals <math>e^{w_{i}}</math> when <math>F_{i}(x_{\{i\}})</math> is True and 1 otherwise, we can represent probability as: | ||
+ | |||
+ | <math>P(X=x) = \frac{1}{Z} \text{ exp }\left \{ {\sum_{i}{w_{i}n_{i}(x)}} \right \}</math> | ||
+ | |||
+ | where <math>n_{i}(x)</math> is the number of true groundings of <math>F_{i}</math> in <math>x</math>. | ||
+ | |||
+ | == Relationship to Markov Networks == | ||
+ | |||
+ | MLNs are special cases of Markov networks, where the vertices of the network graph are atomic formulas and the edges are the logical connectives used to construct the formula. Each formula is considered to be a clique and the Markov blanket is the set of formulas in which a given atom appears. A potential function is associated to each formula and it is combined with the weight to form the Gibbs measure and partition function for the Markov network. | ||
+ | |||
+ | Therefore, inference in MLNs can be performed using standard Markov network inference techniques like Gibbs sampling, belief propagation and approximation via pseudolikelihood. | ||
== Relevant Papers == | == Relevant Papers == | ||
Line 15: | Line 31: | ||
| ?UsesDataset | | ?UsesDataset | ||
}} | }} | ||
+ | |||
+ | == Comments == | ||
+ | |||
+ | Just one comment, sometimes standard MRF inference techniques don't work well for MLN's because the (1) the state space is very large, and (2) deterministic or near-deterministic dependencies cause problems for inference (esp. for MCMC). One of the original Domingos papers talks a bit about a hybrid SAT and MCMC algorithm to get around this issue. --[[User:Brendan|Brendan]] 18:45, 13 October 2011 (UTC) |
Latest revision as of 13:45, 13 October 2011
This is a method that combines first-order logic and probabilistic graphical models. In first-order logic, a set of formulas represent hard constraints over a set of instances, and if an instance violates one of them, it has zero probability. The basic idea of a Markov Logic Network (MLN) is to generalize first-order logic by softening those hard constraints, assigning a real number (the weight) to each formula to indicate how hard it is, so that an instance that violates one or more formulas is not impossible anymore, just less probable.
Definition
A Markov Logic Network is a set of pairs where is a formula in first-order logic and is a real number. Given is a finite set of constants , defines together with a Markov Network in the following way:
- contains one binary node for each possible grounding of each predicate in . If the ground atom is true, the value of the node is 1, and 0 otherwise.
- ML, C contains one feature for each possible grounding of each formula in . If the ground formula is true, the value of the feature is 1, and 0 otherwise; and the weight of the feature is .
In other words, the Markov network has an edge between two nodes iff the corresponding predicates appear together in at least one formula in . Then, the probability distribution over possible instances specified by ML, C is:
where is the normalization factor, is the potential function defined on the -th clique which is related to , and is the discrete variable vector in the clique. Given that equals when is True and 1 otherwise, we can represent probability as:
where is the number of true groundings of in .
Relationship to Markov Networks
MLNs are special cases of Markov networks, where the vertices of the network graph are atomic formulas and the edges are the logical connectives used to construct the formula. Each formula is considered to be a clique and the Markov blanket is the set of formulas in which a given atom appears. A potential function is associated to each formula and it is combined with the weight to form the Gibbs measure and partition function for the Markov network.
Therefore, inference in MLNs can be performed using standard Markov network inference techniques like Gibbs sampling, belief propagation and approximation via pseudolikelihood.
Relevant Papers
Comments
Just one comment, sometimes standard MRF inference techniques don't work well for MLN's because the (1) the state space is very large, and (2) deterministic or near-deterministic dependencies cause problems for inference (esp. for MCMC). One of the original Domingos papers talks a bit about a hybrid SAT and MCMC algorithm to get around this issue. --Brendan 18:45, 13 October 2011 (UTC)