Difference between revisions of "Linearizing Dependency Trees"


Revision as of 00:29, 13 September 2011

The task

Given a set of dependencies between words in a sentence, predict the linear sequence of words. This is the inverse problem of dependency parsing, where a sequence of words is given and the dependencies are found.
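To make the input and output concrete, here is a minimal sketch in Python. It is not the project's method, which this page does not describe. It assumes the dependencies arrive as unordered (head, dependent) pairs, that every word in the sentence is unique, and that the root word is known. The function name `linearize_head_first` is hypothetical. The baseline simply emits each head before its dependents, which shows why the problem is not trivial: the tree alone does not say on which side of its head a dependent belongs.

```python
from collections import defaultdict


def linearize_head_first(edges, root):
    """Naive baseline: emit every head before its dependents, depth-first.

    `edges` is an iterable of (head, dependent) word pairs. Words are assumed
    to be unique within the sentence (an assumption made for this sketch only).
    """
    children = defaultdict(list)
    for head, dependent in edges:
        children[head].append(dependent)

    order = []

    def visit(word):
        order.append(word)
        # Sort dependents so the output is deterministic; the input set
        # carries no word-order information.
        for child in sorted(children[word]):
            visit(child)

    visit(root)
    return order


# Dependencies for the sentence "John saw the dog", given as an unordered set.
edges = {("saw", "John"), ("saw", "dog"), ("dog", "the")}
print(" ".join(linearize_head_first(edges, "saw")))
# prints: saw John dog the
```

The baseline recovers the right words but the wrong order, so a real system has to learn which dependents precede their head and which follow it.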

The datasets

There will be two datasets.

The first dataset is the English side of the FBIS Chinese-English dataset, parsed with the Stanford dependency parser. One thousand lines will be held out from training and used for evaluation.

The second dataset will be a subset of the English Gigaword corpus, also parsed with the Stanford dependency parser.

Evaluation

The output sentence will be evaluated using the BLEU metric, with the original sentence as the reference.
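As an illustration of the scoring, the sketch below computes an unsmoothed sentence-level BLEU score against a single reference. The page does not say which BLEU variant or tool will be used, so the function name `sentence_bleu`, the `max_n` parameter, and the lack of smoothing are assumptions of this sketch. BLEU is also commonly aggregated over a whole test set rather than scored per sentence.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU of one candidate against one reference (token lists)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: 1 unless the candidate is shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "John saw the dog".split()
print(sentence_bleu(reference, reference))                  # exact order
print(sentence_bleu("saw John dog the".split(), reference)) # no bigram matches
```

Because a linearized output is a reordering of the reference words, unigram precision is always 1 and the brevity penalty is always 1, so the score is driven entirely by how many higher-order n-grams land in the right order.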

The Team

This is a one-man team consisting of Jeff Flanigan.