E.A. Leicht, Structure of Time Evo citation networks 2007
E.A. Leicht, G. Clarkson, K. Shedden, and M.E.J. Newman. 2007. Large-scale structure of time evolving citation networks. In European Physical Journal B, Volume 59, pp. 75–83.
This paper uses three methods to examine the large-scale structure of networks that evolve over time, with a focus on citation networks, and demonstrates how each method can reveal divisions in the structure of such a network. A network of citations between opinions of the United States Supreme Court is used as the running example.
Brief Description of Three Analysis Methods
- A mixture model of the citation process, fitted with the expectation-maximization (EM) algorithm.
This method divides vertices into groups whose citations have similar time profiles. Suppose a network has n vertices representing documents, and that these vertices can be divided into c groups. A log-likelihood function is then defined, and maximizing it gives the best estimate of the model parameters. The maximization alternates between two steps: 1. estimate the group membership probabilities; 2. use the obtained probabilities to maximize the log-likelihood function. After a short mathematical derivation, the paper concludes that the division obtained with this model is self-consistent. Some examples are also given to demonstrate the method.
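The two alternating steps can be sketched as follows. This is a minimal illustration, not the paper's exact parametrization: it assumes each document is summarized by the number of citations it received in each time bin, and it models each group as a multinomial distribution over those bins. The function name `em_time_profiles`, the input layout `counts`, the smoothing constant `eps`, and the deterministic initialization are all assumptions made for this example.

```python
import math

def em_time_profiles(counts, n_groups, n_iter=100, eps=1e-9):
    """Fit a mixture of multinomial time profiles by EM (illustrative sketch).

    counts[i][t] = citations received by document i in time bin t (assumed layout).
    Returns (pi, theta, q): group weights, per-group time profiles, and
    per-document group membership probabilities.
    """
    n, T = len(counts), len(counts[0])
    pi = [1.0 / n_groups] * n_groups
    # Deterministic start that breaks symmetry: group r mildly favors
    # the r-th slice of the time axis.
    theta = []
    for r in range(n_groups):
        row = [2.0 if (t * n_groups) // T == r else 1.0 for t in range(T)]
        s = sum(row)
        theta.append([x / s for x in row])
    q = [[0.0] * n_groups for _ in range(n)]
    for _ in range(n_iter):
        # Step 1 (E-step): estimate group membership probabilities q[i][r].
        for i in range(n):
            logp = [
                math.log(pi[r])
                + sum(c * math.log(theta[r][t]) for t, c in enumerate(counts[i]))
                for r in range(n_groups)
            ]
            m = max(logp)  # subtract the max before exponentiating, for stability
            w = [math.exp(lp - m) for lp in logp]
            z = sum(w)
            q[i] = [x / z for x in w]
        # Step 2 (M-step): re-estimate parameters to maximize the log-likelihood.
        for r in range(n_groups):
            weight = sum(q[i][r] for i in range(n))
            pi[r] = (weight + eps) / (n + n_groups * eps)
            row = [sum(q[i][r] * counts[i][t] for i in range(n)) + eps
                   for t in range(T)]
            s = sum(row)
            theta[r] = [x / s for x in row]
    return pi, theta, q

# Toy data (invented for illustration): three documents cited early, three cited late.
counts = [[5, 3, 0, 0], [4, 4, 1, 0], [6, 2, 0, 0],
          [0, 0, 3, 5], [0, 1, 4, 4], [0, 0, 2, 6]]
pi, theta, q = em_time_profiles(counts, n_groups=2)
labels = [max(range(2), key=lambda r: q_i[r]) for q_i in q]
```

Here `labels` assigns each document to its most probable group, while `q` keeps the soft membership probabilities that the E-step produces.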
- A clustering method for citation networks.
This method is a community analysis that groups together vertices that are densely linked to one another by edges. It uses a method proposed by Newman, based on maximizing a benefit function known as "modularity". The signs of the elements of the leading eigenvector of Newman's "modularity matrix" give an approximation to the division of the network into two parts. By repeatedly dividing the network in two in this way, a network can be divided into any number of communities. A comparison between this analysis and the EM analysis is given, and the results of the two algorithms are similar. However, since the clustering analysis ignores edge direction, there are some differences between the two results.
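A single bisection by the sign of the leading eigenvector can be sketched as below. The sketch assumes an undirected, unweighted network (matching the remark that edge direction is ignored) and the standard modularity matrix B_ij = A_ij - k_i k_j / (2m), where k_i is the degree of vertex i and m the number of edges. The function names, the shifted power iteration used to find the eigenvector, and the toy graph are assumptions made for this example. Only the first split is shown, not the repeated subdivision.

```python
import math

def modularity_matrix(n, edges):
    """Build B_ij = A_ij - k_i * k_j / (2m) for an undirected edge list."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] += 1.0
        A[v][u] += 1.0
    k = [sum(row) for row in A]   # vertex degrees
    two_m = sum(k)                # twice the number of edges
    return [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]

def leading_eigenvector(B, n_iter=1000):
    """Power iteration on B + shift*I, so the most positive eigenvalue dominates."""
    n = len(B)
    # The largest absolute row sum bounds every |eigenvalue| of B.
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [1.0 + 0.01 * i for i in range(n)]  # deterministic, non-uniform start
    for _ in range(n_iter):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def bisect_by_sign(n, edges):
    """Split vertices into two groups by the sign of the leading eigenvector."""
    v = leading_eigenvector(modularity_matrix(n, edges))
    positive = {i for i in range(n) if v[i] > 0}
    return positive, set(range(n)) - positive

# Toy graph (invented for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
group_a, group_b = bisect_by_sign(6, edges)
```

The overall sign of an eigenvector is arbitrary, so which triangle ends up in `group_a` is not meaningful; only the partition itself is.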
- Vertex authority score method
This method relies on centrality scores, which quantify the importance or influence of individual vertices in a network. In a directed network such as a citation network, each vertex has two degrees, the in-degree and the out-degree. It is reasonable, for instance, to imagine that important or influential vertices in a citation network will receive many citations and therefore have high in-degree. Accordingly, each vertex has two centralities, the authority score and the hub score, the first derived from the incoming links and the second from the outgoing links. Hub and authority scores have previously been applied to Supreme Court cases by Fowler and Jeon, and this paper draws on their results. Applying this method reveals a repeated pattern in the evolution of the citation network.
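Hub and authority scores are commonly computed by the iterative scheme below, in which a vertex's authority score is the sum of the hub scores of the vertices citing it, and its hub score is the sum of the authority scores of the vertices it cites. This is a minimal sketch of that standard iteration, not necessarily the exact procedure used in the paper or by Fowler and Jeon. The function name `hits`, the `(citing, cited)` edge layout, and the toy data are assumptions made for this example.

```python
import math

def hits(n, citations, n_iter=100):
    """Iteratively compute authority and hub scores for a directed network.

    citations is a list of (citing, cited) pairs (assumed layout).
    Returns (auth, hub), each normalized to unit Euclidean length.
    """
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(n_iter):
        # Authority: sum of the hub scores of the vertices that cite this one.
        auth = [0.0] * n
        for citing, cited in citations:
            auth[cited] += hub[citing]
        # Hub: sum of the authority scores of the vertices this one cites.
        hub = [0.0] * n
        for citing, cited in citations:
            hub[citing] += auth[cited]
        # Normalize so the scores stay bounded; guard against an empty edge list.
        a_norm = math.sqrt(sum(x * x for x in auth)) or 1.0
        h_norm = math.sqrt(sum(x * x for x in hub)) or 1.0
        auth = [x / a_norm for x in auth]
        hub = [x / h_norm for x in hub]
    return auth, hub

# Toy citation list (invented for illustration): vertex 0 is cited by 1, 2 and 3.
citations = [(1, 0), (2, 0), (3, 0), (3, 1), (4, 3)]
auth, hub = hits(5, citations)
top_authority = max(range(5), key=lambda i: auth[i])
top_hub = max(range(5), key=lambda i: hub[i])
```

In this toy network the most-cited vertex has the highest authority score, and the vertex that cites the most well-cited vertices has the highest hub score.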