Difference between revisions of "Ancora"

Latest revision as of 20:39, 24 September 2011

AnCora consists of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each containing 500,000 words. The corpora are annotated at several levels:

Lemma and Part of Speech
Syntactic constituents and functions
Argument structure and thematic roles
Semantic classes of the verb
Denotative type of deverbal nouns
Nouns related to WordNet synsets
Named Entities
Coreference relations
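
To make the layered annotation concrete, the levels listed above could be pictured as optional fields attached to each word. The following is only a minimal sketch: the class name, field names, and sample values are hypothetical and are not taken from the AnCora distribution format, and constituent structure, which spans several words, is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """One annotated word. Every field name here is a hypothetical label."""
    form: str                             # surface word as it appears in the text
    lemma: str                            # lemma
    pos: str                              # part of speech
    function: Optional[str] = None        # syntactic function
    argument: Optional[str] = None        # argument position
    thematic_role: Optional[str] = None   # thematic role
    verb_class: Optional[str] = None      # semantic class of the verb
    deverbal_type: Optional[str] = None   # denotative type of a deverbal noun
    synset: Optional[str] = None          # WordNet synset of a noun
    named_entity: Optional[str] = None    # named entity type
    coref_id: Optional[str] = None        # coreference chain identifier


def annotated_layers(token: Token) -> List[str]:
    """Return the names of the annotation layers that are filled in for a token."""
    return [
        name
        for name, value in vars(token).items()
        if name != "form" and value is not None
    ]


# Illustrative values only, not real corpus data.
example = Token(form="canta", lemma="cantar", pos="verb", function="predicate")
print(annotated_layers(example))
```

Keeping every layer optional reflects that not all levels apply to every word: for instance, only verbs would carry a semantic class, and only nouns a synset.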

The AnCora corpus is mainly based on journalistic texts.

The corpus website is [1].

Revision as of 20:37, 24 September 2011 by Ysim		Latest revision as of 20:39, 24 September 2011 by Ysim
(6 intermediate revisions by the same user not shown)
Line 1:
−	AnCora [Category:Dataset] consist of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each of them of 500,000 words. The corpora are annotated at different levels:
+	[[Category::Dataset|Ancora]] consist of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each of them of 500,000 words. The corpora are annotated at different levels:
 	* Lemma and Part of Speech
