Difference between revisions of "Ancora"

Revision as of 19:37, 24 September 2011

AnCora [Category:Dataset] consists of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each containing 500,000 words. The corpora are annotated at different levels:

* Lemma and Part of Speech
* Syntactic constituents and functions
* Argument structure and thematic roles
* Semantic classes of the verb
* Denotative type of deverbal nouns
* Nouns related to WordNet synsets
* Named Entities
* Coreference relations

The AnCora corpus is mainly based on journalistic texts.

The corpus website is http://clic.ub.edu/corpus/en.

@@ Line 1: / Line 1: @@
-AnCora consist of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each of them of 500,000 words. The corpora are annotated at different levels:
+AnCora [Category:Dataset] consist of a Catalan corpus (AnCora-CA) and a Spanish corpus (AnCora-ES), each of them of 500,000 words. The corpora are annotated at different levels:
 * Lemma and Part of Speech
@@ Line 12: / Line 12: @@
 AnCora corpus is mainly based on journalist texts.
-The corpus website is [[http://clic.ub.edu/corpus/en]].
+The corpus website is [http://clic.ub.edu/corpus/en].
