LabelPropagation

From Cohen Courses
Jump to navigationJump to search

The Method for graph propagation by Velikovich_et_al,_NAACL_2010, to deal with noisy graphs.

This interpretation may not be entirely correct; it's inferred from their writeup.

The algorithm takes as input a graph with defined edge weights; e.g. pairwise distributional similarity scores among terms. Starting with a set of seed words (e.g. words with positive sentiment orientation), it propagates flow weights (e.g. "positiveness") along the graph.

First, it constructs a new bipartite graph between seeds and non-seeds. I'll call these "flow edges" for lack of a better term. The flow edge ("score"?) between a seed and non-seed is the maximum of score of all paths between them, where a path where is the seed and is the target non-seed, is a multiplicative score:

for original graph edge weights , and discount rate .

The paper doesn't define this objective function; instead, they define an iterative procedure which appears to compute this, but stops after a maximum number of iterations ; since the flow weight quickly decays, it's only necessary to consider paths up to size .

The final score for a non-seed is the sum of these max-multiplicative path scores.

The argument is that this prevents overcounting for noisily weighted edges and highly connected subgraphs, since only a single path can influence the impact of one seed on a non-seed.

Question: Is this related to the max-product algorithm from belief propagation?