The AUTONOMATA Spoken Names Corpus

Henk van den Heuvel, Jean-Pierre Martens, Bart D’hoore, Kristof D’hanens, Nanneke Konings


Abstract
In the Autonomata project we have collected a corpus of spoken name utterances with manually corrected phonemic transcriptions of these utterances. The corpus was designed with the intention to become a major resource for the development of automatic speech recognition engines that can achieve a high accuracy on the recognition of person and geographical names spoken in Dutch. The recorded names were selected so as to reveal the major pronunciation variations that a speech recognizer of e.g. a navigation system with speech input is going to be confronted with. This includes native speakers speaking foreign names and vice versa.
Anthology ID:
L08-1476
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/48_paper.pdf
DOI:
Bibkey:
Cite (ACL):
Henk van den Heuvel, Jean-Pierre Martens, Bart D’hoore, Kristof D’hanens, and Nanneke Konings. 2008. The AUTONOMATA Spoken Names Corpus. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
The AUTONOMATA Spoken Names Corpus (van den Heuvel et al., LREC 2008)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/48_paper.pdf