Difference between revisions of "Melia et al AISTATS 2001"
(13 intermediate revisions by the same user not shown) | |||
Line 9: | Line 9: | ||
== Summary == | == Summary == | ||
− | This paper gives a general theoretical interpretation of a wide variety of spectral methods. The authors first present the general framework of normalized cuts. We assume we are given an index set <math>I</math> and similarity matrix <math>S</math> where entry <math>S_{ij}</math> represents a similarity between item <math>i</math> and item <math>j</math>. We then seek a partition of <math>I</math> into two sets <math>A,\overline{A}</math>. | + | This [[Category::paper]] focuses on [[AddressesProblem::clustering]] and gives a general theoretical interpretation of a wide variety of spectral methods. Drawing on the general problem of normalized graph cuts, the authors show that the problem can be interpreted in terms of a random walk. |
+ | |||
+ | The authors first present the general framework of normalized cuts. We assume we are given an index set <math>I</math> and similarity matrix <math>S</math> where entry <math>S_{ij}</math> represents a similarity between item <math>i</math> and item <math>j</math>. We then seek a partition of <math>I</math> into two sets <math>A,\overline{A}</math>. This partition is a normalized cut if it minimizes the following criteria: | ||
+ | |||
+ | <math> | ||
+ | NCut(A, \overline{A}) = \left( \frac{1}{Vol(A)} + \frac{1}{Vol(\overline{A})}\right) \sum_{i \in A; j \in \overline{A}} S_{ij}, | ||
+ | </math> | ||
+ | |||
+ | where, for <math>C \subseteq I </math> , we define: | ||
+ | |||
+ | <math> | ||
+ | Vol(C) = \sum_{i \in C} \sum_{j\in I} S_{ij}. | ||
+ | </math> | ||
+ | |||
+ | However, the above problem is computationally intractable in general. We can approximate solutions to this problem using the following generalized eigenvalue problem: | ||
+ | |||
<math> | <math> | ||
− | + | Lx = \lambda D x | |
+ | </math> | ||
+ | Here, <math>D</math> is a diagonal matrix with <math>D_{ii} = \sum_{j \in I} S_{ij}</math> (we define this as the degree of <math>i</math>), and <math>L = D - S</math>.. By using the smallest eigenvectors from this solution, and looking for index sets in which the entry values are constant, we can get approximate solutions to the normalized graph cuts problem. | ||
+ | The authors recast the problem by simply rewriting the eigenvalue problem as: | ||
+ | <math> | ||
+ | D^{-1}Sx = \lambda x | ||
</math> | </math> | ||
+ | |||
+ | Now defining <math>P = D^{-1}S</math>, we can clearly see that each row of <math>P</math> sums to 1. Therefore, we can interpret it as a transition matrix of a markov chain, where entry <math>P_{ij}</math> represents the probability of transitioning from index <math>i</math> to index <math>j</math>. This can be thought of as a random walk over the index set. We can thus interpret the a eigenvalue in this problem as groups of index sets which have similar transition probabilities to any other indices. That is, if we group our indices by the similar entries in an eigenvector, we have found subsets of <math>I</math> which have similar transition probabilities to each other. | ||
+ | |||
+ | The authors consequently propose the following general approach. (1) Solve the eigenvalue problem <math>D^{-1}Sx = \lambda x </math> (2) select the k-1 eigenvectors with the k-1 largest eigenvalues -- excluding the largest eigenvalue and (3) clustering the entries as if they were k-1 dimensional vectors. This approach gives an approximate solution to the normalized graph cuts problem. | ||
+ | |||
+ | == Application to Social Media == | ||
+ | |||
+ | The paper does not mention specific applications, so I make a brief note here about this. For social media, a common problem is desiring to partition a large social network, represented as a graph, into several sections. For example, perhaps we want to try and identify different countries from a large social graph of online friendship. The above method yields solutions to just such a problem: partitioning of the index set corresponds to clustering the nodes in a graph. | ||
+ | |||
+ | However, there is one important practical piece which is (understandably) left out of this paper. It is assumed that a similarity matrix between all nodes is available. While this may seem simple, there are many ways to calculate the similarity between nodes in a graph. This decision is often critical in the analysis, and is essential to random walk interpretation provided by this paper: the transition probabilities defined above are directly related to the similarity between the nodes. | ||
+ | |||
+ | Another example of a related application is [[UsesMethod :: Latent semantic indexing]]. | ||
== Related papers == | == Related papers == |
Latest revision as of 12:01, 7 February 2011
Citation
Marina Melia and Jianbo Shi. 2001. A Random Walks View of Spectral Segmentation. In AISTATS 2001.
Online version
Available on Marina Melia's Website
Summary
This paper focuses on clustering and gives a general theoretical interpretation of a wide variety of spectral methods. Drawing on the general problem of normalized graph cuts, the authors show that the problem can be interpreted in terms of a random walk.
The authors first present the general framework of normalized cuts. We assume we are given an index set and similarity matrix where entry represents a similarity between item and item . We then seek a partition of into two sets . This partition is a normalized cut if it minimizes the following criteria:
where, for , we define:
However, the above problem is computationally intractable in general. We can approximate solutions to this problem using the following generalized eigenvalue problem:
Here, is a diagonal matrix with (we define this as the degree of ), and .. By using the smallest eigenvectors from this solution, and looking for index sets in which the entry values are constant, we can get approximate solutions to the normalized graph cuts problem.
The authors recast the problem by simply rewriting the eigenvalue problem as:
Now defining , we can clearly see that each row of sums to 1. Therefore, we can interpret it as a transition matrix of a markov chain, where entry represents the probability of transitioning from index to index . This can be thought of as a random walk over the index set. We can thus interpret the a eigenvalue in this problem as groups of index sets which have similar transition probabilities to any other indices. That is, if we group our indices by the similar entries in an eigenvector, we have found subsets of which have similar transition probabilities to each other.
The authors consequently propose the following general approach. (1) Solve the eigenvalue problem (2) select the k-1 eigenvectors with the k-1 largest eigenvalues -- excluding the largest eigenvalue and (3) clustering the entries as if they were k-1 dimensional vectors. This approach gives an approximate solution to the normalized graph cuts problem.
Application to Social Media
The paper does not mention specific applications, so I make a brief note here about this. For social media, a common problem is desiring to partition a large social network, represented as a graph, into several sections. For example, perhaps we want to try and identify different countries from a large social graph of online friendship. The above method yields solutions to just such a problem: partitioning of the index set corresponds to clustering the nodes in a graph.
However, there is one important practical piece which is (understandably) left out of this paper. It is assumed that a similarity matrix between all nodes is available. While this may seem simple, there are many ways to calculate the similarity between nodes in a graph. This decision is often critical in the analysis, and is essential to random walk interpretation provided by this paper: the transition probabilities defined above are directly related to the similarity between the nodes.
Another example of a related application is Latent semantic indexing.