The Database for Spoken German — DGD2

Thomas Schmidt


Abstract
The Database for Spoken German (Datenbank für Gesprochenes Deutsch, DGD2, http://dgd.ids-mannheim.de) is the central platform for publishing and disseminating spoken language corpora from the Archive of Spoken German (Archiv für Gesprochenes Deutsch, AGD, http://agd.ids-mannheim.de) at the Institute for the German Language in Mannheim. The corpora contained in the DGD2 come from a variety of sources, some of them in-house projects, some of them external projects. Most of the corpora were originally intended either for research into the (dialectal) variation of German or for studies in conversation analysis and related fields. The AGD has taken over the task of permanently archiving these resources and making them available for reuse to the research community. To date, the DGD2 offers access to 19 different corpora, totalling around 9000 speech events, 2500 hours of audio recordings or 8 million transcribed words. This paper gives an overview of the data made available via the DGD2, of the technical basis for its implementation, and of the most important functionalities it offers. The paper concludes with information about the users of the database and future plans for its development.
Anthology ID:
L14-1183
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
1451–1457
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/171_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Thomas Schmidt. 2014. The Database for Spoken German — DGD2. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1451–1457, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
The Database for Spoken German — DGD2 (Schmidt, LREC 2014)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/171_Paper.pdf