The Language Archive — a new hub for language resources

Sebastian Drude, Daan Broeder, Paul Trilsbeek, Peter Wittenburg


Abstract
This contribution presents “The Language Archive” (TLA), a new unit at the MPI for Psycholinguistics, and discusses current developments in the management of scientific data, considering the need for new data research infrastructures. Although several initiatives worldwide in the realm of language resources aim at the integration, preservation and mobilization of research data, the state of such scientific data is still often problematic. Data are often neither well organized and archived nor described by metadata; even unique data, such as observational field-work data on endangered languages, are still mostly stored on perishable carriers. New data centres are needed that provide trusted, quality-reviewed, persistent services and suitable tools, and that take legal and ethical issues seriously. The CLARIN initiative has established criteria for suitable centres. TLA is in a good position to be one such centre. It is based on three essential pillars: (1) a data archive; (2) management, access and annotation tools; (3) archiving and software expertise for collaborative projects. The archive hosts mostly observational data on small languages worldwide and language acquisition data, but also data resulting from experiments.
Anthology ID:
L12-1530
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3264–3267
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/891_Paper.pdf
Cite (ACL):
Sebastian Drude, Daan Broeder, Paul Trilsbeek, and Peter Wittenburg. 2012. The Language Archive — a new hub for language resources. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3264–3267, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
The Language Archive — a new hub for language resources (Drude et al., LREC 2012)