Difference between revisions of "Maximization of the benefit function known as "modularity""

Latest revision as of 02:04, 7 February 2011

Briefly the method works as follows. Given a network and a particular division of the vertices of that network into nonoverlapping groups or communities, the modularity is defined as the number of edges that lie within those groups minus the expected number of such edges if edges are placed at random between the vertices (but respecting vertex degree). In essence, the modularity measures whether a larger than expected number of edges fall within the groups defined. In principle, the task of finding the best division of the network into groups is then one of maximizing the modularity over all possible divisions. In practice, this maximization problem is known to be NP-complete, so approximate solution methods must be used for all but the smallest networks.

Newman’s method works by rewriting the modularity in the language of linear algebra as a quadratic form involving an index vector and a characteristic matrix dubbed the “modularity matrix.” It can then be shown that the signs of the elements of the leading eigenvector of this modularity matrix give an approximation to the division of the network into two parts that maximizes the modularity. This approximate maximum can optionally be further refined by, for instance, applying a greedy algorithm that moves vertices between groups as described in . By repeatedly dividing the network in two in this way, a network can be divided into any number of communities, although typically one stops dividing when no divisions exist that will increase the modularity any further.

This repeated subdivision of the network into smaller and smaller groups is particularly attractive for the purposes of our present analysis, because it allows us to observe the major divisions in the network first, followed by more minor ones, and to stop the process at any point to compare with our other analyses. A limitation of the method is that it is designed for use with undirected rather than directed networks. This however is not a great hindrance. It seems reasonable to consider edges in a citation network to be a sign of connection between documents, and that connection exists regardless of the direction the edge runs in. So we simply ignore the directions in our analysis and apply the eigenvector calculation to the undirected network. This approach has been taken before by other authors and appears to work well.

Reference

Modularity and community structure in networks

Modularity knowledge on Wikipedia

Difference between revisions of "Maximization of the benefit function known as "modularity""

Latest revision as of 02:04, 7 February 2011

Reference

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

Tools

@@ Line 1: / Line 1: @@
+Briefly the method works as follows.
+Given a network and a particular division of the vertices of that network
+into nonoverlapping groups or communities, the modularity
+is defined as the number of edges that lie within
+those groups minus the expected number of such edges
+if edges are placed at random between the vertices (but
+respecting vertex degree). In essence, the modularity
+measures whether a larger than expected number of
+edges fall within the groups defined. In principle, the task
+of finding the best division of the network into groups is
+then one of maximizing the modularity over all possible
+divisions. In practice, this maximization problem is
+known to be NP-complete, so approximate solution
+methods must be used for all but the smallest networks.
+Newman’s method works by rewriting the modularity in
+the language of linear algebra as a quadratic form involving
+an index vector and a characteristic matrix dubbed the “modularity matrix.” It can then be shown that the
+signs of the elements of the leading eigenvector of this
+modularity matrix give an approximation to the division
+of the network into two parts that maximizes the modularity.
+This approximate maximum can optionally be further
+refined by, for instance, applying a greedy algorithm
+that moves vertices between groups as described in .
+By repeatedly dividing the network in two in this way, a
+network can be divided into any number of communities,
+although typically one stops dividing when no divisions
+exist that will increase the modularity any further.
+This repeated subdivision of the network into smaller
+and smaller groups is particularly attractive for the purposes
+of our present analysis, because it allows us to observe
+the major divisions in the network first, followed by
+more minor ones, and to stop the process at any point
+to compare with our other analyses. A limitation of the
+method is that it is designed for use with undirected rather
+than directed networks. This however is not a great hindrance.
+It seems reasonable to consider edges in a citation
+network to be a sign of connection between documents,
+and that connection exists regardless of the direction the
+edge runs in. So we simply ignore the directions in our
+analysis and apply the eigenvector calculation to the undirected
+network. This approach has been taken before by
+other authors and appears to work well.
+== '''Reference''' ==
 [http://www.pnas.org/content/103/23/8577.full.pdf+html Modularity and community structure in networks]
 [http://en.wikipedia.org/wiki/Modularity_(networks) Modularity knowledge on Wikipedia]