Palabras: Crowdsourcing Transcriptions of L2 Speech

Eric Sanders, Pepi Burgos, Catia Cucchiarini, Roeland van Hout


Abstract
We developed a web application for crowdsourcing transcriptions of Dutch words spoken by Spanish L2 learners. In this paper we discuss the design of the application and the influence of metadata and various forms of feedback. Useful data were obtained from 159 participants, with an average of over 20 transcriptions per item, which seems a satisfactory result for this type of research. Informing participants about how many items they still had to complete, and not how many they had already completed, turned out to be an incentive to do more items. Assigning participants a score for their performance made it more attractive for them to carry out the transcription task, but this seemed to influence their performance. We discuss possible advantages and disadvantages in connection with the aim of the research and consider possible lessons for designing future experiments.
Anthology ID:
L16-1508
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3186–3191
URL:
https://aclanthology.org/L16-1508
Cite (ACL):
Eric Sanders, Pepi Burgos, Catia Cucchiarini, and Roeland van Hout. 2016. Palabras: Crowdsourcing Transcriptions of L2 Speech. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3186–3191, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Palabras: Crowdsourcing Transcriptions of L2 Speech (Sanders et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1508.pdf