Difference between revisions of "Ancora"

Revision as of 19:38, 24 September 2011

AnCora is a dataset consisting of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each containing 500,000 words. The corpora are annotated at several levels:

Lemma and Part of Speech
Syntactic constituents and functions
Argument structure and thematic roles
Semantic classes of the verb
Denotative type of deverbal nouns
Nouns related to WordNet synsets
Named Entities
Coreference relations

The AnCora corpus is mainly based on journalistic texts.

The corpus website is [1].

Edit by Ysim at 19:38, 24 September 2011, line 1: [[Category::Dataset|dataset]] changed to [[Category::Dataset]].
