Co-clustering documents and words using bipartite spectral graph partitioning

Citation

M. E. J. Newman. 2001. The Structure of Scientific Collaboration Networks. Proceedings of the National Academy of Sciences, 98(2):404-409.

Online Version

http://www.cs.utexas.edu/users/inderjit/public_papers/kdd_bipartite.pdf

Databases

MEDLINE (biomedical research)[1]

Los Alamos e-Print Archive (physics)[2]

NCSTRL (computer science)[3]

Summary

This paper investigates the structure of scientific collaboration networks, in which two scientists are connected if they have coauthored a paper. The author used data from a number of databases covering different fields: biomedical research, physics, and computer science. The main properties of these networks are:

  • In all cases, scientific communities seem to constitute a "small world"[4], in which the average distance between scientists via a chain of intermediate collaborators grows only logarithmically with the size of the relevant community.
  • These networks are highly clustered, meaning that two scientists are much more likely to have collaborated if they have a third collaborator in common than are two scientists chosen at random from the community.
  • The distributions of both the number of collaborators per scientist and the number of papers per scientist are well fit by power-law forms with an exponential cutoff. This cutoff may be caused by the finite time window (1995-1999) used in the study.
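  • There are a number of statistically significant differences between scientific communities. Some of these are obvious, such as the much higher average number of collaborators per author in experimental high-energy physics, which is known for its very large collaborations.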
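
The two structural measures in the list above, clustering and average distance, can be illustrated on a toy example. The following is a minimal sketch, not the paper's code: the `papers` list and all function names are hypothetical, and the clustering coefficient is assumed to be the usual transitivity ratio (3 × triangles / connected triples).

```python
from collections import deque
from itertools import combinations

# Hypothetical toy data: each paper is a list of author names.
papers = [
    ["A", "B", "C"],
    ["C", "D"],
    ["D", "E"],
]

def build_graph(papers):
    """Connect two authors if they share at least one paper."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def clustering_coefficient(graph):
    """Fraction of connected triples that are closed (transitivity)."""
    closed = 0   # triples whose two end points are also linked
    triples = 0  # all paths of length two, counted by their center node
    for nbrs in graph.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        for u, v in combinations(nbrs, 2):
            if v in graph[u]:
                closed += 1
    return closed / triples if triples else 0.0

def mean_distance(graph):
    """Average shortest-path length over all connected pairs (BFS)."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

graph = build_graph(papers)
degrees = {author: len(nbrs) for author, nbrs in graph.items()}
clustering = clustering_coefficient(graph)
distance = mean_distance(graph)
```

On real data, `degrees` would give the collaborator counts whose distribution the paper fits with a power law with exponential cutoff. The all-pairs BFS here costs on the order of (nodes × edges) operations, so a network the size of MEDLINE would probably need sampling or a dedicated graph library.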