Difference between revisions of "Wikipedia Infobox Generator Using Cross Lingual Unstructured Text"
From Cohen Courses
Revision as of 12:06, 8 September 2011
Wikipedia Infobox Generator Using Cross Lingual Unstructured Text
Team Members
- Anirudh Koul
- Daegun Won
- Tony Navas
Project Idea
Corpus
Wikipedia XML Dumps (Current Revision only)
- http://en.wikipedia.org/wiki/Wikipedia_database#Other_languages
- English corpus size: 31 GB uncompressed
- With 5 languages, approximately 200 GB total
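Because a dump of this size will not fit in memory, the pages would likely have to be read as a stream. The following is a minimal sketch of that idea, not the team's implementation: the helper names (`iter_pages`, `has_infobox`), the hand-written sample XML, and the simple `{{Infobox` substring check are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) pairs one page at a time from a dump stream."""
    title, text = None, ""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # release the page's contents once it has been yielded


def has_infobox(wikitext):
    """Crude check (assumption): infobox templates begin with '{{Infobox'."""
    return "{{infobox" in wikitext.lower()


# Tiny hand-written sample in the general shape of a dump (not real dump data).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/">
  <page>
    <title>Example City</title>
    <revision><text>{{Infobox settlement | name = Example City}} Body.</text></revision>
  </page>
  <page>
    <title>Example Stub</title>
    <revision><text>No template here.</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
with_infobox = [title for title, text in pages if has_infobox(text)]
```

For a real dump, the stream could come from `bz2.open(path, "rb")` instead of the in-memory sample. Infobox template names differ across languages, so the substring check would need a per-language variant.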
Reference Papers
Wu, Weld (2007). Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM '07).