Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin

Samuel Frontull, Georg Moser


Abstract
This paper explores the impact of different back-translation approaches on machine translation for Ladin, specifically the Val Badia variant. Given the limited amount of parallel data available for this language (only 18k Ladin-Italian sentence pairs), we investigate the performance of a multilingual neural machine translation model fine-tuned for Ladin-Italian. In addition to the available authentic data, we synthesise further translations by using three different models: a fine-tuned neural model, a rule-based system developed specifically for this language pair, and a large language model. Our experiments show that all approaches achieve comparable translation quality in this low-resource scenario, yet round-trip translations highlight differences in model performance.
Anthology ID:
2024.loresmt-1.13
Volume:
Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venues:
LoResMT | WS
Publisher:
Association for Computational Linguistics
Pages:
128–138
URL:
https://aclanthology.org/2024.loresmt-1.13/
DOI:
10.18653/v1/2024.loresmt-1.13
Cite (ACL):
Samuel Frontull and Georg Moser. 2024. Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 128–138, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin (Frontull & Moser, LoResMT 2024)
PDF:
https://aclanthology.org/2024.loresmt-1.13.pdf