What is lost in Normalization? Exploring Pitfalls in Multilingual ASR Model Evaluations

Kavya Manohar, Leena Pillai


Abstract
This paper explores the pitfalls in evaluating multilingual automatic speech recognition (ASR) models, with a particular focus on Indic language scripts. We investigate the text normalization routines employed by leading ASR models, including OpenAI Whisper, Meta's MMS, Seamless, and Assembly AI's Conformer, and their unintended consequences on performance metrics. Our research reveals that current text normalization practices, while aiming to standardize ASR outputs for fair comparison by removing inconsistencies such as variations in spelling, punctuation, and special characters, are fundamentally flawed when applied to Indic scripts. Through empirical analysis using text similarity scores and in-depth linguistic examination, we demonstrate that these flaws lead to artificially improved performance metrics for Indic languages. We conclude by proposing a shift towards developing text normalization routines that leverage native linguistic expertise, ensuring more robust and accurate evaluations of multilingual ASR models.
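The failure mode the abstract describes can be sketched in a few lines of Python. Whisper-style "basic" normalizers replace every Unicode Mark, Symbol, and Punctuation character with a space; the snippet below is an illustrative approximation of that step (not the exact code shipped with any of the evaluated models). It shows why the mapping is harmless for Latin scripts but destructive for a script like Malayalam, where vowel signs and the anusvara are combining marks.

```python
import unicodedata

def basic_normalize(text: str) -> str:
    """Approximation of a language-agnostic ASR text normalizer:
    after NFKC normalization, every character whose Unicode category
    starts with M (mark), S (symbol), or P (punctuation) is replaced
    with a space, and whitespace is collapsed."""
    text = unicodedata.normalize("NFKC", text.lower())
    cleaned = "".join(" " if unicodedata.category(c)[0] in "MSP" else c
                      for c in text)
    return " ".join(cleaned.split())

# Latin script: only punctuation is removed, as intended.
print(basic_normalize("Hello, world!"))  # -> "hello world"

# Malayalam: the vowel sign (ാ, category Mc) and the anusvara (ം, Mc)
# are combining marks, so stripping category-M characters mangles
# the word itself rather than just its punctuation.
print(basic_normalize("മലയാളം"))  # -> "മലയ ള"
```

Because both the reference and the hypothesis pass through the same lossy mapping, genuine recognition errors in vowel signs or conjuncts can vanish before the error rate is computed, which is how the metric ends up artificially improved.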
Anthology ID: 2024.emnlp-main.607
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 10864–10869
URL: https://aclanthology.org/2024.emnlp-main.607
Cite (ACL): Kavya Manohar and Leena Pillai. 2024. What is lost in Normalization? Exploring Pitfalls in Multilingual ASR Model Evaluations. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 10864–10869, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): What is lost in Normalization? Exploring Pitfalls in Multilingual ASR Model Evaluations (Manohar & Pillai, EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.607.pdf