Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR

Abhishek Gupta; Amruta Parulekar; Sameep Chattopadhyay; Preethi Jyothi

Abstract

Automatic speech recognition (ASR) for low-resource languages remains a challenge due to the scarcity of labeled training data. Parameter-efficient fine-tuning and text-only adaptation are two popular methods for addressing such low-resource settings. In this work, we investigate how these techniques can be effectively combined using a multilingual multimodal model such as SeamlessM4T. Multimodal models can leverage unlabeled text via text-only adaptation followed by parameter-efficient ASR fine-tuning, thus boosting ASR performance. We also show cross-lingual transfer from a high-resource language, achieving up to a 17% relative WER reduction over the baseline in an extremely low-resource setting without any labeled speech.
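To make the "relative WER reduction" figure concrete, the following is a minimal sketch of how word error rate and its relative reduction are commonly computed. It is not the authors' evaluation code: the function names are hypothetical, the tokenization is a plain whitespace split, and the numbers in the usage lines are illustrative placeholders, not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction by which the adapted system lowers WER relative to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


# Illustrative placeholder values only (not taken from the paper).
example_wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
example_reduction = relative_wer_reduction(baseline_wer=0.50, adapted_wer=0.415)
```

Under this definition, a drop from a baseline WER of 0.50 to 0.415 is an absolute reduction of 8.5 points but a relative reduction of 17%, which is the kind of quantity the abstract reports.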

Anthology ID: 2024.mrl-1.13
Volume: Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Jonne Sälevä, Abraham Owodunni
Venue: MRL
Publisher: Association for Computational Linguistics
Pages: 175–185
URL: https://aclanthology.org/2024.mrl-1.13
Cite (ACL): Abhishek Gupta, Amruta Parulekar, Sameep Chattopadhyay, and Preethi Jyothi. 2024. Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR. In Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024), pages 175–185, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR (Gupta et al., MRL 2024)
PDF: https://aclanthology.org/2024.mrl-1.13.pdf