Coreference Resolution in a Modular, Entity-Centered Model


Citation

A. Haghighi and D. Klein. 2010. Coreference Resolution in a Modular, Entity-Centered Model. In Proceedings of NAACL-HLT 2010.

Summary

This paper presents a mostly unsupervised approach to coreference resolution based on a hierarchical generative process, represented as a probabilistic graphical model. The model has three levels: mentions, which are present in the text; abstract entities, to which the mentions refer; and types, which are classes of entities. Modeling types lets the system generalize across different entities. Generation proceeds top-down. First, a list of entities is drawn by drawing a list of types and then one entity from each of those types. Next, mentions are drawn from the entities using a sequential distance-dependent Chinese restaurant process. Finally, each mention generates a surface realization. To train the model, each level of the generative hierarchy is updated in turn using EM until all levels have converged. This approximates standard EM over the full model, which would be computationally infeasible. All training uses unlabeled data, except for prototypes of the types, which are hard-coded at the start of training. The explicit modeling of discourse is an interesting aspect of the paper and seems like a good starting point for generative models of discourse that could be incorporated into other models.
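The three-step generative story in the summary (types and entities, then mention assignments, then surface forms) can be made concrete with a small toy sampler. This is only an illustrative sketch, not the paper's parameterization: the `INVENTORY` of types, entities, and surface forms, the exponential distance decay, and the `alpha` and `decay` parameters are all assumptions chosen for the example.

```python
import math
import random

# Hypothetical toy inventory (not from the paper): type -> entity -> surface forms.
INVENTORY = {
    "PERSON": {
        "Ada Lovelace": ["Ada Lovelace", "Lovelace", "she"],
        "Alan Turing": ["Alan Turing", "Turing", "he"],
    },
    "ORGANIZATION": {
        "Acme Corp": ["Acme Corp", "Acme", "the company"],
        "Globex": ["Globex", "the firm", "it"],
    },
}


def draw_entities(num_entities, rng):
    """Step 1: draw a type for each slot, then an entity of that type."""
    entities = []
    for _ in range(num_entities):
        entity_type = rng.choice(sorted(INVENTORY))
        name = rng.choice(sorted(INVENTORY[entity_type]))
        entities.append((entity_type, name))
    return entities


def draw_assignments(num_mentions, num_entities, alpha, decay, rng):
    """Step 2: sequential distance-dependent CRP over mention positions.

    Mention i either reuses the entity of an earlier mention j, with weight
    exp(-(i - j) / decay) (an assumed decay function), or picks an entity
    afresh from the drawn list, with weight alpha.
    """
    assignments = []
    for i in range(num_mentions):
        weights = [math.exp(-(i - j) / decay) for j in range(i)]
        weights.append(alpha)  # last option: do not link to an earlier mention
        choice = rng.choices(list(range(i + 1)), weights=weights, k=1)[0]
        if choice == i:
            assignments.append(rng.randrange(num_entities))
        else:
            assignments.append(assignments[choice])
    return assignments


def generate_document(num_entities, num_mentions, alpha=1.0, decay=2.0, seed=0):
    """Run all three steps and return (entities, mentions)."""
    rng = random.Random(seed)
    entities = draw_entities(num_entities, rng)
    assignments = draw_assignments(num_mentions, num_entities, alpha, decay, rng)
    mentions = []
    for entity_index in assignments:
        entity_type, name = entities[entity_index]
        # Step 3: each mention generates a surface realization of its entity.
        surface = rng.choice(INVENTORY[entity_type][name])
        mentions.append((entity_index, surface))
    return entities, mentions


if __name__ == "__main__":
    entities, mentions = generate_document(num_entities=3, num_mentions=8)
    for entity_index, surface in mentions:
        print(entity_index, entities[entity_index][1], "->", surface)
```

In this sketch a smaller `decay` makes a mention more likely to corefer with its nearest predecessors, while a larger `alpha` makes it more likely to pick an entity independently of earlier mentions. Coreference resolution inverts this process: given only the surface forms, it infers the hidden entity assignments.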

Datasets used

ACE corpora coreference data set.