Linearizing Dependency Trees

From Cohen Courses

The task

Given a set of dependencies between the words of a sentence, predict the linear order of those words. This is the inverse of dependency parsing, in which the word sequence is given and the dependencies must be found.
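To make the input and output concrete, here is a minimal sketch of the task in Python. The triple format, the `LEFT_RELS` heuristic, and the `linearize` function are all illustrative assumptions, not the method proposed for this project; the relation names (`nsubj`, `det`, `amod`, `dobj`) follow Stanford dependency labels. The sketch places each dependent to the left or right of its head according to its relation, which can only produce projective orders.

```python
from collections import defaultdict

ROOT = "ROOT"

# Assumed heuristic: relations whose dependents usually precede
# their head in English. This set is illustrative, not from the source.
LEFT_RELS = {"nsubj", "det", "amod"}


def linearize(deps):
    """Turn an unordered set of (head, relation, dependent) triples
    into a word sequence using a fixed left/right rule per relation."""
    children = defaultdict(list)
    for head, rel, dep in deps:
        children[head].append((rel, dep))

    def visit(node):
        left, right = [], []
        # Sort so the result does not depend on set iteration order.
        for rel, dep in sorted(children[node]):
            (left if rel in LEFT_RELS else right).append(dep)
        out = []
        for dep in left:
            out.extend(visit(dep))
        if node != ROOT:
            out.append(node)
        for dep in right:
            out.extend(visit(dep))
        return out

    return visit(ROOT)


# Toy input. Each word is unique here, so the word itself serves as
# the node id; real data would need token ids to handle repeated words.
deps = {
    ("ROOT", "root", "chased"),
    ("chased", "nsubj", "dog"),
    ("chased", "dobj", "cat"),
    ("dog", "det", "the"),
    ("cat", "det", "a"),
}

print(" ".join(linearize(deps)))
```

A fixed rule like this ignores the words themselves; a learned model would instead score alternative orderings of each head and its dependents.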

The dataset

There will be two datasets.

The first dataset is the English side of the FBIS Chinese-English dataset, parsed with the Stanford dependency parser. One thousand lines will be held out from training and used for evaluation.

The second dataset will be a subset of the English Gigaword corpus, also parsed with the Stanford dependency parser.

Evaluation

The output sentence will be evaluated using the BLEU metric, with the original sentence as the reference.
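As an illustration of this evaluation, the following is a minimal sketch of corpus-level BLEU with a single reference per sentence, written from scratch in Python. The function names, the choice of n-grams up to length 4, and the absence of smoothing are assumptions; the page does not say which BLEU variant or implementation will be used. Because a linearization is a permutation of the reference words, the output and reference have equal length, so the brevity penalty is 1 and the score depends only on n-gram order.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hyps, refs, max_n=4):
    """Unsmoothed corpus-level BLEU, one reference per hypothesis."""
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            # Clipped matches: each reference n-gram is credited at most
            # as many times as it occurs in the reference.
            matched[n - 1] += sum((h & r).values())
            total[n - 1] += sum(h.values())
    if min(matched) == 0:
        # Without smoothing, any zero precision makes the score zero.
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_prec)


ref = "the dog chased a cat in the park".split()
hyp = "the dog chased a cat the in park".split()
print(corpus_bleu([hyp], [ref]))
```

Note that on a single short sentence the unsmoothed score is often zero, since one missing 4-gram match is enough; aggregating counts over the whole held-out set, as above, avoids most of that problem.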

The Team

This is a one-man team consisting of Jeff Flanigan.