SSNCSE_NLP@LT-EDI-ACL2022: Speech Recognition for Vulnerable Individuals in Tamil using pre-trained XLSR models

Dhanya Srinivasan, Bharathi B, Thenmozhi Durairaj, Senthil Kumar B


Abstract
Automatic speech recognition is a tool used to transform human speech into a written form. It is used in a variety of avenues, such as voice commands, customer service, and more. It has emerged as an essential tool in the digitisation of daily life. It has been known to be of vital importance in making the lives of elderly and disabled people much easier. In this paper, we describe an automatic speech recognition model, determined by using three pre-trained models, fine-tuned from the Facebook XLSR Wav2Vec2 model, which was trained using the Common Voice Dataset. The best model for speech recognition in Tamil is determined by finding the word error rate of the data. This work explains the submission made by SSNCSE_NLP in the shared task organized by LT-EDI at ACL 2022. A word error rate of 39.4512 is achieved.
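The abstract selects the best model by word error rate (WER). As a minimal sketch of how that metric is commonly computed, assuming whitespace tokenisation and the standard definition (word-level edit distance divided by the number of reference words), the function below returns WER as a fraction. The name `word_error_rate` is hypothetical and is not taken from the paper; the shared task's own scoring script may normalise text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly,
    with no case folding or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Multiplying the result by 100 gives a percentage. Under this definition, lower is better, and the value can exceed 1.0 when the hypothesis contains many inserted words.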
Anthology ID:
2022.ltedi-1.48
Volume:
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venues:
ACL | LTEDI
Publisher:
Association for Computational Linguistics
Pages:
317–320
URL:
https://aclanthology.org/2022.ltedi-1.48
DOI:
10.18653/v1/2022.ltedi-1.48
Cite (ACL):
Dhanya Srinivasan, Bharathi B, Thenmozhi Durairaj, and Senthil Kumar B. 2022. SSNCSE_NLP@LT-EDI-ACL2022: Speech Recognition for Vulnerable Individuals in Tamil using pre-trained XLSR models. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pages 317–320, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
SSNCSE_NLP@LT-EDI-ACL2022: Speech Recognition for Vulnerable Individuals in Tamil using pre-trained XLSR models (Srinivasan et al., LTEDI 2022)
PDF:
https://aclanthology.org/2022.ltedi-1.48.pdf