Difference between revisions of "Co-clustering documents and words using bipartite spectral graph partitioning"

From Cohen Courses
== Databases ==

[[Category::MEDLINE]] (biomedical research) [http://www.nlm.nih.gov/databases/databases_medline.html]

[[Category::Los Alamos e-Print Archive]] (physics) [http://xxx.lanl.gov/]

[[Category::NCSTRL]] (computer science) [http://www.ncstrl.org/]


Revision as of 01:49, 28 March 2011

Citation

M. E. J. Newman. 2001. The Structure of Scientific Collaboration Networks. Proceedings of the National Academy of Sciences, 98(2):404-409.

Online Version

http://www.cs.utexas.edu/users/inderjit/public_papers/kdd_bipartite.pdf

Summary

This paper investigates the structure of scientific collaboration networks, in which two scientists are connected if they have coauthored a paper. The author used data from a number of databases in different fields: biomedical research, physics, and computer science. The main properties of these networks are:

  • In all cases, scientific communities seem to constitute a "small world" [1], in which the average distance between scientists via a line of intermediate collaborators varies logarithmically with the size of the relevant community.
  • The networks are highly clustered, meaning that two scientists are much more likely to have collaborated if they have a third common collaborator than are two scientists chosen at random from the community.
  • The distributions of both the number of collaborators per scientist and the number of papers per scientist are well fit by power-law forms with an exponential cutoff. This cutoff may be caused by the finite time window (1995-1999) used in the study.
  • There are a number of significant statistical differences between different scientific communities. Some of these are obvious.
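
The first two properties above (short average distance and high clustering) can be illustrated with a minimal sketch on a toy coauthorship network. The paper list, the author labels, and the function names below are hypothetical and chosen only for illustration; the clustering coefficient is computed here as the fraction of connected triples that close into triangles, which is one common definition and is assumed rather than taken from this page.

```python
from collections import deque
from itertools import combinations


def build_coauthor_graph(papers):
    """Link every pair of authors who share at least one paper."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clustering_coefficient(graph):
    """Fraction of connected triples (two collaborators of one author) that are closed."""
    closed = 0
    triples = 0
    for neighbours in graph.values():
        for u, v in combinations(neighbours, 2):
            triples += 1
            if v in graph[u]:
                closed += 1
    return closed / triples if triples else 0.0


def mean_distance(graph):
    """Average shortest-path length over all reachable ordered pairs (BFS from each node)."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["A", "B", "C"], ["C", "D"], ["D", "E"]]
graph = build_coauthor_graph(papers)

print("clustering coefficient:", clustering_coefficient(graph))
print("mean distance:", mean_distance(graph))
```

On real bibliographic data the same two quantities would be computed over the full network of authors; the study's claim is that the mean distance grows only logarithmically with the number of scientists while the clustering stays far above what a random graph of the same size would show.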