ADESSE, a Database with Syntactic and Semantic Annotation of a Corpus of Spanish

José M. García-Miguel, Gael Vaamonde, Fita González Domínguez


Abstract
This is an overall description of ADESSE ("Base de datos de verbos, Alternancias de Diátesis y Esquemas Sintáctico-Semánticos del Español"), an online database (http://adesse.uvigo.es/) with syntactic and semantic information for all clauses in a corpus of Spanish. The manually annotated corpus has 1.5 million words, 159,000 clauses and 3,450 different verb lemmas. ADESSE is an expanded version of BDS ("Base de datos sintácticos del español actual"), which contains the grammatical features of verbs and verb arguments in the corpus. ADESSE adds semantic features such as verb sense, verb class and semantic role of arguments, making possible a detailed corpus-based syntactic and semantic characterization of verb valency. Each verb entry in the database is described in terms of valency potential and valency realizations (diatheses). The former includes a set of semantic roles of participants in a particular event type and a classification into a conceptual hierarchy of process types. Valency realizations are described in terms of correspondences between voice, syntactic functions and categories, and semantic roles. Verb senses are discriminated at two levels: a more abstract level linked to a valency potential, and more specific verb senses that take into account particular lexical instantiations of arguments.
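The verb-entry structure described in the abstract can be pictured with a small data model. The following is a minimal sketch only: every class, field and method name is hypothetical, and the sample values for "dar" are invented for illustration rather than taken from ADESSE or its actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ValencyPotential:
    """Abstract level: roles of an event type plus its place in a hierarchy."""
    process_class: List[str]  # path in a conceptual hierarchy, general to specific
    roles: List[str]          # semantic roles of the participants


@dataclass
class ArgumentRealization:
    """One expressed argument: role paired with function and category."""
    role: str
    syntactic_function: str
    category: str


@dataclass
class Diathesis:
    """A valency realization: a voice plus the arguments expressed in it."""
    voice: str
    arguments: List[ArgumentRealization]


@dataclass
class VerbEntry:
    lemma: str
    potential: ValencyPotential
    realizations: List[Diathesis] = field(default_factory=list)
    # Second, more specific sense level, tied to lexical fillers of arguments.
    specific_senses: List[str] = field(default_factory=list)

    def unrealized_roles(self, diathesis: Diathesis) -> List[str]:
        """Roles in the valency potential that a given diathesis leaves unexpressed."""
        expressed = {arg.role for arg in diathesis.arguments}
        return [role for role in self.potential.roles if role not in expressed]


# Illustrative values only (assumed, not copied from the database).
active = Diathesis("active", [
    ArgumentRealization("giver", "subject", "noun phrase"),
    ArgumentRealization("gift", "direct object", "noun phrase"),
    ArgumentRealization("recipient", "indirect object", "prepositional phrase"),
])
passive = Diathesis("passive", [
    ArgumentRealization("gift", "subject", "noun phrase"),
])
dar = VerbEntry(
    lemma="dar",
    potential=ValencyPotential(["transfer"], ["giver", "gift", "recipient"]),
    realizations=[active, passive],
)
```

In this sketch, `dar.unrealized_roles(passive)` lists the roles of the valency potential that the passive realization leaves out, which is the kind of potential-versus-realization comparison the two-part description supports.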
Anthology ID:
L10-1593
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/859_Paper.pdf
Cite (ACL):
José M. García-Miguel, Gael Vaamonde, and Fita González Domínguez. 2010. ADESSE, a Database with Syntactic and Semantic Annotation of a Corpus of Spanish. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
ADESSE, a Database with Syntactic and Semantic Annotation of a Corpus of Spanish (García-Miguel et al., LREC 2010)