Group Fairness in Multilingual Speech Recognition Models

Anna Zee, Marc Zee, Anders Søgaard


Abstract
We evaluate the performance disparity of the Whisper and MMS families of ASR models across the VoxPopuli and Common Voice multilingual datasets, with an eye toward intersectionality. We report two main findings. First, model size, surprisingly, correlates logarithmically with worst-case performance disparity: larger (and better) models are less fair. Second, intersectionality matters: in particular, models often exhibit significant performance disparity across binary gender for adolescents.
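The kind of evaluation the abstract describes — scoring an ASR model's transcripts per demographic subgroup and taking the worst-case gap — can be sketched as follows. This is an illustrative reconstruction, not the authors' released code; the WER implementation, the `worst_case_disparity` helper, and the group labels are all assumptions.

```python
# Hedged sketch of a group-fairness ASR evaluation: compute word error rate
# (WER) per demographic subgroup, then report the worst-case gap between
# any two groups. Illustrative only; not the paper's official pipeline.
from collections import defaultdict

def wer(ref_words, hyp_words):
    """Word error rate: Levenshtein distance over word tokens,
    normalized by reference length (rolling-row DP)."""
    d = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp_words, 1):
            # deletion, insertion, substitution/match
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
    return d[len(hyp_words)] / max(len(ref_words), 1)

def worst_case_disparity(samples):
    """samples: iterable of (group, reference, hypothesis) triples, where
    `group` is e.g. an intersectional label like ("female", "adolescent").
    Returns (max WER gap between any two groups, per-group mean WER)."""
    errs = defaultdict(list)
    for group, ref, hyp in samples:
        errs[group].append(wer(ref.split(), hyp.split()))
    group_wer = {g: sum(v) / len(v) for g, v in errs.items()}
    gap = max(group_wer.values()) - min(group_wer.values())
    return gap, group_wer
```

Under this formulation, the paper's worst-case disparity metric is just the spread of per-group WERs; intersectional groups are obtained by using (gender, age-band) tuples as the group key.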
Anthology ID:
2024.findings-naacl.143
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2213–2226
URL:
https://aclanthology.org/2024.findings-naacl.143
Cite (ACL):
Anna Zee, Marc Zee, and Anders Søgaard. 2024. Group Fairness in Multilingual Speech Recognition Models. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2213–2226, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Group Fairness in Multilingual Speech Recognition Models (Zee et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.143.pdf
Copyright:
2024.findings-naacl.143.copyright.pdf