Difference between revisions of "Wikipedia Infobox Generator Using Cross Lingual Unstructured Text"
From Cohen Courses
Revision as of 12:12, 8 September 2011
Wikipedia Infobox Generator By Combining Multi Lingual Unstructured Text
Team Members
- Anirudh Koul
- Daegun Won
- Tony Navas
Project Idea
Corpus
- Wikipedia XML dumps (current revision only)
- http://en.wikipedia.org/wiki/Wikipedia_database#Other_languages
- English corpus size: 31 GB uncompressed
- Estimated total for 5 languages: approximately 200 GB
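Because each dump is tens of gigabytes, it would likely have to be read as a stream rather than loaded into memory. The following is a minimal sketch of one way to do that, not the team's actual pipeline. It assumes the standard MediaWiki export layout (`page`, `title`, and `text` elements); the function names, the `marker` parameter, and the sample page are hypothetical and only serve as illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_infobox_pages(stream, marker="{{infobox"):
    """Yield (title, wikitext) for pages whose text contains `marker`.

    `marker` is an assumption: the template prefix differs between
    language editions, so it would need to be set per language.
    """
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    title, text = None, None
    for event, elem in context:
        if event != "end":
            continue
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if text is not None and marker in text.lower():
                yield title, text
            title, text = None, None
            root.clear()  # drop finished pages so memory use does not grow


# Tiny hand-written sample standing in for a real dump file.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/">
  <page>
    <title>Example City</title>
    <revision><text>{{Infobox settlement
| name = Example City
}}
Example City is a hypothetical place.</text></revision>
  </page>
  <page>
    <title>Plain Page</title>
    <revision><text>No template here.</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_infobox_pages(io.BytesIO(SAMPLE.encode("utf-8"))))
print([title for title, _ in pages])
```

For a real compressed dump, the same function could be given a file object from `bz2.open(path, "rb")`, where `path` is whichever dump file was downloaded.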
Reference Papers
- Wu, Weld (2007). Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM '07)