# Heckerman, JMLR 2000

## Citation

Dependency Networks for Inference, Collaborative Filtering, and Data Visualization. David Heckerman, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite, Carl Kadie; in JMLR, 1(Oct):49-75, 2000

## Summary

In this paper, author describe a graphical model for probabilistic relationship, an alternative to the bayesian network, called dependency network. The dependency network, unlike bayesian network is potentially cyclic. The dependency network are well suited to task of predicting preferences like in collaborative filtering. The dependency network is not good for encoding causal relationship.

## Brief description of the method

• Consistent Dependency Network

Given a domain of interest having set of finite variables ${\displaystyle X=(X_{1},...,X_{n})}$ with a positive joint distribution ${\displaystyle p(x)}$, a consistent dependency network for ${\displaystyle X}$ is a pair ${\displaystyle (G,P)}$ where ${\displaystyle G}$ is a cyclic directed graph and ${\displaystyle P}$ is a set of conditional probability distributions. The parents of node ${\displaystyle X_{i}}$, denoted ${\displaystyle Pa_{i}}$, correspond to those variable ${\displaystyle Pa_{i}\subseteq (X_{1},..,X_{i-1},X_{i+1},...,X_{n})}$ that satisfy: ${\displaystyle p(x_{i}|pa_{i})=p(x_{i}|x_{1},....,x_{i-1},x_{i+1},...,x_{n})=p(x_{i}|x\setminus x_{i})}$

The dependency network is consistent in the sense that each local distribution can be obtained via inference from the joint distribution ${\displaystyle p(x)}$.

The inference can be perform by converting to a markov network, triangulating that network and then applying one of the standard algorithms for the inference.

• General Dependency Network

Given a domain of interest having set of finite variables ${\displaystyle X=(X_{1},...,X_{n})}$, let ${\displaystyle P=(p_{1}(x_{1}|x\setminus x_{1}),...,p_{n}(x_{n}|x\setminus x_{m}))}$ be a set of conditional distribution, one for each variable in ${\displaystyle X}$. A dependency network for ${\displaystyle X}$ and ${\displaystyle P}$ is a pair ${\displaystyle (G,P')}$ where ${\displaystyle G}$ is a (usually cyclic) directed graph and ${\displaystyle P'}$ is a set of conditional probability distribution satisfying ${\displaystyle p_{i}(x_{i}|Pa_{i})=p_{i}(x_{i}|x\setminus x_{i})}$

## Experimental Result

They experimented it on the problem of collaborative filtering on three datasets - Nielsen, MS.COM and MSNBC. Overall Bayesian network are slightly more accurate but dependency network are significantly faster at prediction.