ACTIV-ES: a comparable, cross-dialect corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain

Jerid Francom, Mans Hulden, Adam Ussishkin


Abstract
Corpus resources for Spanish have proved invaluable for a number of applications in a wide variety of fields. However, the majority of these resources are based on formal, written language and/or are not built to model variation between varieties of Spanish, despite the fact that most language in ‘everyday’ use is informal and dialogue-based and shows rich regional variation. This paper outlines the development and evaluation of the ACTIV-ES corpus, a first step toward producing a comparable, cross-dialect corpus representative of the ‘everyday’ language of various regions of the Spanish-speaking world.
Anthology ID:
L14-1543
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1733–1737
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/691_Paper.pdf
Cite (ACL):
Jerid Francom, Mans Hulden, and Adam Ussishkin. 2014. ACTIV-ES: a comparable, cross-dialect corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1733–1737, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
ACTIV-ES: a comparable, cross-dialect corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain (Francom et al., LREC 2014)