Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification

Simone Hantke, Erik Marchi, Björn Schuller


Abstract
Crowdsourcing is an emerging collaborative approach applicable, among many other areas, to language and speech processing. It has already been applied in speech processing with promising results. However, only a few studies have investigated its use in computational paralinguistics. In this contribution, we propose a novel evaluator for crowdsourced ratings, termed the Weighted Trustability Evaluator (WTE), which is computed from each rater's consistency over the test questions. We further investigate the reliability of crowdsourced annotations compared to those obtained with traditional labelling procedures, such as constrained listening experiments in laboratories or in controlled environments. This comparison includes an in-depth analysis of the obtainable classification performance. The experiments were conducted on the Speaker Likability Database (SLD), previously used in the INTERSPEECH Challenge 2012, and the results lend further weight to the assumption that crowdsourcing can serve as a reliable annotation source for computational paralinguistics, given a sufficient number of raters and suitable measures of their reliability.
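
The abstract does not state the WTE formula, so the following is only an illustrative sketch of the general idea of weighting crowdsourced ratings by how consistently each rater handles test questions. It assumes, hypothetically, that a rater's weight is the fraction of test questions answered in agreement with a reference answer, and that item labels are weight-averaged numeric ratings. The function names `rater_weights` and `weighted_label` and the data layout are assumptions for illustration, not the authors' method.

```python
def rater_weights(test_answers, gold):
    """Assumed weighting: share of test questions a rater answered like the reference.

    test_answers: {rater_id: {question_id: answer}}
    gold:         {question_id: reference_answer}
    """
    weights = {}
    for rater, answers in test_answers.items():
        scored = [q for q in answers if q in gold]
        if not scored:
            # A rater who saw no test questions gets no trust in this sketch.
            weights[rater] = 0.0
            continue
        correct = sum(1 for q in scored if answers[q] == gold[q])
        weights[rater] = correct / len(scored)
    return weights


def weighted_label(ratings, weights):
    """Weight-averaged numeric rating for one item; None if no trusted rater rated it.

    ratings: {rater_id: numeric_rating}
    """
    total = sum(weights.get(r, 0.0) for r in ratings)
    if total == 0:
        return None
    return sum(weights.get(r, 0.0) * v for r, v in ratings.items()) / total


# Hypothetical toy data: two raters, two test questions, one rated item.
weights = rater_weights(
    {"a": {"q1": 1, "q2": 0}, "b": {"q1": 0, "q2": 0}},
    {"q1": 1, "q2": 0},
)
label = weighted_label({"a": 1, "b": 0}, weights)
```

In this toy setup the rater who matches the reference on every test question contributes more to the item label than the rater who matches on only half of them. The paper itself should be consulted for the actual definition of the evaluator.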
Anthology ID:
L16-1342
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2156–2161
URL:
https://aclanthology.org/L16-1342
Cite (ACL):
Simone Hantke, Erik Marchi, and Björn Schuller. 2016. Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2156–2161, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification (Hantke et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1342.pdf