Koo et al, ACL 2008


Simple semi-supervised dependency parsing is a paper by Koo, Carreras, and Collins, available online at [www.cs.columbia.edu/~mcollins/papers/koo08acl.pdf].

Work in progress by Andrew Rodriguez since 9/29.

==Citation==

Terry Koo, Xavier Carreras, and Michael Collins. Simple semi-supervised dependency parsing. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 595-603, Columbus, Ohio, USA, June 2008.

==Summary==

The authors' experiment is to train a dependency parser; their improvement is to use word clusters as additional features. The clusters are derived from the [[UsesDataset::BLLIP]] corpus using the [[UsesMethod::Brown clustering]] algorithm. They examine (1) the performance gained by adding word clusters as features and (2) how much training data the cluster-augmented parser needs to match the accuracy of the parser without cluster features. It turns out that with the cluster features, the parser does statistically significantly better. They also observe that with cluster features, only about half the training data is needed to achieve the same accuracy.
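To make the feature idea concrete, here is a minimal sketch, in Python, of how cluster-prefix features for a candidate head-modifier arc might sit alongside baseline word and part-of-speech features. This is not the authors' exact feature set: the function arc_features, the dictionary brown_cluster, and the prefix lengths 4 and 6 are illustrative assumptions.

<pre>
# Illustrative sketch only: not the paper's actual feature templates.
# brown_cluster is an assumed dict mapping each word to its Brown bit-string.

def arc_features(words, tags, brown_cluster, head, mod):
    """Baseline + cluster features for a candidate arc head -> mod."""
    hw, mw = words[head], words[mod]   # head / modifier word forms
    ht, mt = tags[head], tags[mod]     # head / modifier POS tags
    hc = brown_cluster.get(hw, "")     # Brown bit-string for the head
    mc = brown_cluster.get(mw, "")     # Brown bit-string for the modifier

    feats = [
        f"hw={hw}:mw={mw}",            # baseline lexical pair
        f"ht={ht}:mt={mt}",            # baseline POS pair
    ]
    # Cluster features at two granularities: a short prefix behaves like a
    # coarse part-of-speech tag, a longer prefix like a refined word class.
    for k in (4, 6):
        feats.append(f"hc{k}={hc[:k]}:mc{k}={mc[:k]}")
        feats.append(f"hc{k}={hc[:k]}:mt={mt}")
        feats.append(f"ht={ht}:mc{k}={mc[:k]}")
    return feats
</pre>

These string-valued features would then be fed to the parser's learner in the usual way; the point is only that each baseline template has a cluster-based analogue at more than one granularity.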

==Clustering==

They use the [[UsesMethod::Brown clustering]] algorithm to group words into clusters. The result is a mapping from words to bit-strings, where each bit-string encodes a word's path through a binary merge tree; the longer the common prefix of two bit-strings, the more similar the words. With the bit-strings, we can decide how many clusters to allow by considering only the first k bits of each string. If the tree were perfectly balanced, that would give <math>2^k</math> clusters.
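As a small illustration of the truncation step, the following Python sketch groups words by the first k bits of their bit-strings; the bit-strings below are invented for the example, not taken from BLLIP.

<pre>
# Illustrative sketch: the bit-strings are made up, not real Brown output.
from collections import defaultdict

brown_paths = {
    "apple": "0010",
    "pear":  "0011",
    "walk":  "1100",
    "run":   "1101",
}

def clusters_at_depth(paths, k):
    """Group words by the first k bits of their Brown bit-strings."""
    groups = defaultdict(list)
    for word, bits in paths.items():
        groups[bits[:k]].append(word)
    return dict(groups)

print(clusters_at_depth(brown_paths, 2))
# {'00': ['apple', 'pear'], '11': ['walk', 'run']}
print(clusters_at_depth(brown_paths, 4))
# every word in its own cluster: at most 2**k clusters for any k
</pre>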