Maryam Al Ali
2024
PolyWER: A Holistic Evaluation Framework for Code-Switched Speech Recognition
Karima Kadaoui
|
Maryam Al Ali
|
Hawau Olamide Toyin
|
Ibrahim Mohammed
|
Hanan Aldarmaki
Findings of the Association for Computational Linguistics: EMNLP 2024
Code-switching in speech, particularly between languages that use different scripts, can potentially be correctly transcribed in various forms, including different ways of transliteration of the embedded language into the matrix language script. Traditional methods for measuring accuracy, such as Word Error Rate (WER), are too strict to address this challenge. In this paper, we introduce PolyWER, a proposed framework for evaluating speech recognition systems to handle language-mixing. PolyWER accepts transcriptions of code-mixed segments in different forms, including transliterations and translations. We demonstrate the algorithms use cases through detailed examples, and evaluate it against human judgement. To enable the use of this metric, we appended the annotations of a publicly available Arabic-English code-switched dataset with transliterations and translations of code-mixed speech. We also utilize these additional annotations for fine-tuning ASR models and compare their performance using PolyWER. In addition to our main finding on PolyWER’s effectiveness, our experiments show that alternative annotations could be more effective for fine-tuning monolingual ASR models.
Search