Difference between revisions of "Mapping entity names in a document to places on a map"

Latest revision as of 10:32, 8 September 2011

This is a sample project posted by William - although anyone that wants to work on it for real is welcome to!

Place names are often ambiguous - e.g., London can mean London, Ontario or London, England - and a document will frequently contain many place names. You can view the task of associating the correct place with each place name as a sort of word sense disambiguation (WSD) problem, where an atlas fills the role of a database of word senses.

The goal of this project is to use structured prediction to predict the set of place "senses" that corresponds to the set of place names in a document. This task is thus similar to all-words WSD.
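To make the joint-prediction idea concrete, here is a minimal sketch in Python. It is not the method proposed for this project, only an illustrative baseline: it assumes that places mentioned in the same document tend to be geographically close, and it picks the combination of senses with the smallest total pairwise distance by brute force. The `GAZETTEER` dictionary, its approximate coordinates, and the function names are all hypothetical stand-ins for a real GeoNames lookup.

```python
import itertools
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


# Hypothetical toy gazetteer: place name -> {sense label: (lat, lon)}.
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "London": {
        "London, England": (51.5074, -0.1278),
        "London, Ontario": (42.9849, -81.2453),
    },
    "Paris": {
        "Paris, France": (48.8566, 2.3522),
        "Paris, Ontario": (43.2000, -80.3833),
    },
    "Toronto": {
        "Toronto, Ontario": (43.6532, -79.3832),
    },
}


def disambiguate(names, gazetteer):
    """Jointly choose one sense per (distinct) name, minimizing the
    sum of pairwise distances between the chosen places."""
    candidate_lists = [list(gazetteer[name].items()) for name in names]
    best, best_cost = None, math.inf
    # Exhaustive search over all sense combinations; only feasible
    # for short documents with few candidates per name.
    for combo in itertools.product(*candidate_lists):
        cost = sum(haversine_km(p[1], q[1])
                   for p, q in itertools.combinations(combo, 2))
        if cost < best_cost:
            best, best_cost = combo, cost
    return {name: sense for name, (sense, _) in zip(names, best)}
```

Calling `disambiguate(["London", "Toronto", "Paris"], GAZETTEER)` selects the Ontario senses of London and Paris, because the unambiguous Toronto pulls the other two names toward nearby candidates. A real structured predictor would replace the exhaustive search and the distance-only score with learned features and an efficient inference procedure.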

I have data appropriate for evaluating solutions to this problem. GeoNames.org has a database of 6M+ place names with associated lat/long coordinates. Some pages in Wikipedia are tagged with lat/long coordinates, which disambiguates them relative to GeoNames. I've also collected a set of 600k Wikipedia pages with multiple links to pages with lat/long coordinates. Taken together these could be used to perform supervised learning, or to evaluate unsupervised learning techniques.
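One way such coordinate-tagged data might be used for evaluation is sketched below in Python. The metric is an assumption, not a protocol specified for this project: a prediction counts as correct when its coordinates fall within a chosen distance of the gold (Wikipedia-derived) coordinates. The function names and the default 10 km threshold are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def accuracy_within_km(predicted, gold, threshold_km=10.0):
    """Fraction of predictions lying within threshold_km of the gold location.

    predicted and gold are equal-length lists of (lat, lon) pairs,
    one pair per place-name mention.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(haversine_km(p, g) <= threshold_km
               for p, g in zip(predicted, gold))
    return hits / len(gold)
```

For example, if two mentions both refer to London, Ontario and a system resolves one of them to London, England, `accuracy_within_km` returns 0.5 for that document.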

Proposed by: William Cohen

