The Cora dataset is maintained by Andrew McCallum. Multiple versions exist, each aimed at a different research problem, such as information extraction, coreference resolution, and classification using network information. The network data set is described here.

The Cora network consists of around 37,000 papers and 715,000 citations between them. Each paper also has a research-area classification label. Different subsets of this data have been used in different papers.
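
As a rough illustration of the structure described above, the network can be thought of as a set of directed citation edges plus one label per paper. The sketch below is only an assumption about how one might hold such data in memory: the paper IDs, label names, and the helper functions `build_graph` and `induced_subset` are hypothetical and do not reflect the actual Cora file format or any published subset.

```python
from collections import defaultdict

# Toy stand-in data; the IDs and labels are invented for illustration only
# and are not taken from the real Cora files.
citations = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p4", "p1")]
labels = {
    "p1": "Machine Learning",
    "p2": "Machine Learning",
    "p3": "Databases",
    "p4": "Networking",
}


def build_graph(edges):
    """Map each citing paper to the set of papers it cites."""
    graph = defaultdict(set)
    for citing, cited in edges:
        graph[citing].add(cited)
    return graph


def induced_subset(edges, labels, keep_labels):
    """Keep only papers whose label is in keep_labels, plus citations among them."""
    kept = {paper for paper, label in labels.items() if label in keep_labels}
    return [(a, b) for a, b in edges if a in kept and b in kept]


graph = build_graph(citations)
ml_edges = induced_subset(citations, labels, {"Machine Learning"})
```

Filtering by label and keeping only the citations among the remaining papers is one plausible way a subset could be derived; the subsets used in the literature may have been selected by other criteria.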

