Exploring the Effect of Dialect Mismatched Language Models in Telugu Automatic Speech Recognition

Aditya Yadavalli, Ganesh Sai Mirishkar, Anil Vuppala


Abstract
Previous research has found that the Acoustic Models (AMs) of an Automatic Speech Recognition (ASR) system are susceptible to dialect variations within a language, which degrades ASR performance. To counter this, researchers have proposed building dialect-specific AMs while keeping the Language Model (LM) constant across all dialects. This study explores the effect of dialect-mismatched LMs by considering three Telugu regional dialects: Telangana, Coastal Andhra, and Rayalaseema. We show that dialect variations that surface as differences in lexicon, grammar, and occasionally semantics can significantly degrade the performance of the LM under mismatched conditions. This degradation, in turn, adversely affects the ASR system even when a dialect-specific AM is used. We show a degradation of up to 13.13 perplexity points when the LM is used under mismatched conditions. Furthermore, we show a degradation of over 9% in Character Error Rate (CER) and over 15% in Word Error Rate (WER) in the ASR systems when using mismatched LMs instead of matched LMs.
Anthology ID:
2022.naacl-srw.36
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop
Month:
July
Year:
2022
Address:
Hybrid: Seattle, Washington + Online
Editors:
Daphne Ippolito, Liunian Harold Li, Maria Leonor Pacheco, Danqi Chen, Nianwen Xue
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
292–301
URL:
https://aclanthology.org/2022.naacl-srw.36
DOI:
10.18653/v1/2022.naacl-srw.36
Cite (ACL):
Aditya Yadavalli, Ganesh Sai Mirishkar, and Anil Vuppala. 2022. Exploring the Effect of Dialect Mismatched Language Models in Telugu Automatic Speech Recognition. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, pages 292–301, Hybrid: Seattle, Washington + Online. Association for Computational Linguistics.
Cite (Informal):
Exploring the Effect of Dialect Mismatched Language Models in Telugu Automatic Speech Recognition (Yadavalli et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-srw.36.pdf
Video:
https://aclanthology.org/2022.naacl-srw.36.mp4