Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024

Yash Bhaskar, Parameswari Krishnamurthy


Abstract
This paper presents the systems submitted by the Yes-MT team for the Low-Resource Indic Language Translation Shared Task at WMT 2024, which focuses on translation between English and four languages: Assamese, Mizo, Khasi, and Manipuri. The experiments explored several approaches: fine-tuning pre-trained models such as mT5 and IndicBart in both multilingual and monolingual settings, LoRA fine-tuning of IndicTrans2, zero-shot and few-shot prompting with large language models (LLMs) such as Llama 3 and Mixtral 8x7b, LoRA supervised fine-tuning of Llama 3, and training Transformers from scratch. The systems were evaluated on the test data of the WMT23 Low-Resource Indic Language Translation Shared Task using SacreBLEU and chrF. The results highlight the challenges of low-resource translation and show the potential of LLMs for these tasks, particularly when fine-tuned.
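As a rough illustration of the zero-shot and few-shot prompting mentioned in the abstract, the sketch below assembles a translation prompt from optional example pairs. The prompt template, the function name `build_prompt`, and the placeholder example pairs are assumptions made for illustration; they are not taken from the paper, which may format its prompts differently.

```python
from typing import Sequence, Tuple


def build_prompt(
    source: str,
    src_lang: str,
    tgt_lang: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a translation prompt (hypothetical template).

    With no examples this yields a zero-shot prompt; each (source, target)
    pair in `examples` adds one in-context demonstration (few-shot).
    """
    lines = [f"Translate the following text from {src_lang} to {tgt_lang}."]
    # Each demonstration is shown as a source line followed by its translation.
    for src, tgt in examples:
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    # The sentence to translate comes last, with the target line left open
    # so the model continues from there.
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder pairs; real demonstrations would come from the training data.
    demos = [
        ("<English sentence 1>", "<Assamese translation 1>"),
        ("<English sentence 2>", "<Assamese translation 2>"),
    ]
    print(build_prompt("How are you?", "English", "Assamese", demos))
```

The returned string would then be sent to an LLM such as Llama 3 or Mixtral 8x7b; passing an empty `examples` sequence gives the zero-shot variant.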
Anthology ID:
2024.wmt-1.71
Volume:
Proceedings of the Ninth Conference on Machine Translation
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
788–792
URL:
https://aclanthology.org/2024.wmt-1.71
Cite (ACL):
Yash Bhaskar and Parameswari Krishnamurthy. 2024. Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024. In Proceedings of the Ninth Conference on Machine Translation, pages 788–792, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024 (Bhaskar & Krishnamurthy, WMT 2024)
PDF:
https://aclanthology.org/2024.wmt-1.71.pdf