MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore

Yingxu He, Zhuohan Liu, Geyu Lin, Shuo Sun, Bin Wang, Wenyu Zhang, Xunlong Zou, Nancy F. Chen, AiTi Aw

doi:10.18653/v1/2025.acl-demo.3

Abstract

We introduce MERaLiON-AudioLLM, the first general-purpose audio-based large language model designed for multitask learning, with a particular focus on Singlish understanding. Trained on 62 million multimodal instruction samples comprising a total of 260k hours of audio, it exhibits strong generalization across a diverse set of tasks, including, but not limited to, automatic speech recognition, spoken question answering, speech translation, and paralinguistic analysis. Our results show significant improvements in local speech recognition and task-specific understanding, making MERaLiON-AudioLLM a leading solution for region-specific AI applications. An interactive demo has been developed to enable user-friendly interactions, supported by a backend with customized caching and load-balancing mechanisms. We benchmark the model across a broad range of multilingual and multitask scenarios, where it demonstrates competitive performance compared to other open-source models. The demo page, model weights, and videos are publicly accessible.

Anthology ID: 2025.acl-demo.3
Original: 2025.acl-demo.3v1
Version 2: 2025.acl-demo.3v2
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Pushkar Mishra, Smaranda Muresan, Tao Yu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 22–30
URL: https://aclanthology.org/2025.acl-demo.3/
DOI: 10.18653/v1/2025.acl-demo.3
Cite (ACL): Yingxu He, Zhuohan Liu, Geyu Lin, Shuo Sun, Bin Wang, Wenyu Zhang, Xunlong Zou, Nancy F. Chen, and AiTi Aw. 2025. MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 22–30, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore (He et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-demo.3.pdf
