On Mitigating Performance Disparities in Multilingual Speech Recognition

Monorama Swain, Anna Zee, Anders Søgaard


Abstract
How far have we come in mitigating performance disparities across genders in multilingual speech recognition? We compare how different fine-tuning algorithms for automatic speech recognition affect gender disparity across model sizes and languages, considering both performance-focused and fairness-promoting algorithms. Across languages, we see slightly better performance for female speakers with larger models, regardless of the fine-tuning algorithm. The best trade-off between performance and parity is obtained with adapter fusion. The fairness-promoting fine-tuning algorithms (Group-DRO and Spectral Decoupling) hurt performance compared to adapter fusion while achieving only slightly better parity. LoRA slightly increases disparities. Fairness-promoting fine-tuning techniques also led to slightly higher variance in performance across languages, with the exception of adapter fusion.
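Of the fine-tuning algorithms named above, Group-DRO is perhaps the least self-explanatory. As a rough illustration only, the sketch below shows a minimal Group-DRO-style loss in PyTorch, with groups corresponding to, e.g., gender (optionally crossed with language); the class name, step size, and grouping are assumptions made for illustration and are not taken from the paper.

# A minimal sketch of a Group-DRO-style objective (Sagawa et al., 2020),
# one of the fairness-promoting fine-tuning algorithms compared above.
# The class name, step size, and gender-by-language grouping are
# illustrative assumptions, not the authors' actual implementation.
import torch


class GroupDROLoss(torch.nn.Module):
    """Upweights the loss of the currently worst-performing group."""

    def __init__(self, n_groups: int, eta: float = 0.01):
        super().__init__()
        self.eta = eta  # step size for the exponentiated-gradient update
        self.register_buffer("q", torch.ones(n_groups) / n_groups)

    def forward(self, per_sample_loss: torch.Tensor, group_ids: torch.Tensor) -> torch.Tensor:
        # Mean loss per group (e.g. one group per gender, or gender x language);
        # groups absent from the batch contribute zero.
        group_losses = torch.stack([
            per_sample_loss[group_ids == g].mean()
            if (group_ids == g).any() else per_sample_loss.new_zeros(())
            for g in range(self.q.numel())
        ])
        # Shift weight toward groups with high loss, then renormalize.
        self.q = self.q * torch.exp(self.eta * group_losses.detach())
        self.q = self.q / self.q.sum()
        # The model is trained on the q-weighted loss, so gradients focus
        # on whichever group currently has the highest error.
        return (self.q * group_losses).sum()

In a fine-tuning loop, per_sample_loss would be the per-utterance ASR loss (e.g. CTC or cross-entropy) and group_ids would encode each speaker's group; the exponentiated-gradient update keeps shifting weight toward whichever group currently has the highest loss, which is what makes the objective fairness-promoting.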
Anthology ID: 2024.emnlp-main.323
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 5647–5655
URL: https://aclanthology.org/2024.emnlp-main.323
Cite (ACL): Monorama Swain, Anna Zee, and Anders Søgaard. 2024. On Mitigating Performance Disparities in Multilingual Speech Recognition. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 5647–5655, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): On Mitigating Performance Disparities in Multilingual Speech Recognition (Swain et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.323.pdf