Multiloop Incremental Bootstrapping for Low-Resource Machine Translation

Wuying Liu, Wei Li, Lin Wang


Abstract
Due to the scarcity of high-quality bilingual sentence pairs, deep-learning-based machine translation algorithms often cannot reach satisfactory performance in low-resource machine translation. We therefore integrate ideas from machine learning algorithm improvement and data augmentation, propose a novel multiloop incremental bootstrapping framework, and design a corresponding semi-supervised learning algorithm. The framework is a meta-framework independent of any specific machine translation algorithm. The algorithm makes full use of bilingual seed data of appropriate scale together with very large-scale monolingual data to expand the bilingual sentence-pair corpus incrementally, and trains machine translation models step by step to improve translation quality. Experimental results on neural machine translation over multiple language pairs show that the proposed framework can exploit a continuous supply of monolingual data to improve itself. Its effectiveness lies not only in easily achieving state-of-the-art low-resource machine translation, but also in providing a practical option for quickly building precise domain-specific machine translation systems.
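The bootstrapping procedure sketched in the abstract can be read as repeated self-training: train on seed pairs, translate a slice of monolingual data, filter the synthetic pairs, add them to the corpus, and retrain. The Python sketch below illustrates only this loop structure under stated assumptions; the helper names (train_model, translate, filter_pairs), the batching scheme, and the stopping criterion are hypothetical and not the authors' implementation or filtering criteria.

from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)

def multiloop_bootstrap(
    seed_pairs: List[Pair],                                 # bilingual seed data
    monolingual: List[str],                                 # large monolingual source-side data
    train_model: Callable[[List[Pair]], object],            # trains an MT model on sentence pairs
    translate: Callable[[object, List[str]], List[str]],    # model translates a list of sentences
    filter_pairs: Callable[[List[Pair]], List[Pair]],       # keeps only high-quality synthetic pairs
    loops: int = 3,
    batch: int = 10_000,
) -> object:
    """Incrementally grow the bilingual corpus and retrain the MT model."""
    pairs = list(seed_pairs)
    model = train_model(pairs)                  # initial model from the seed data
    for i in range(loops):
        # take the next slice of monolingual data for this loop
        mono_slice = monolingual[i * batch:(i + 1) * batch]
        if not mono_slice:
            break
        # synthesize candidate pairs by translating the monolingual slice
        candidates = list(zip(mono_slice, translate(model, mono_slice)))
        # keep only pairs that pass the quality filter, then add them incrementally
        pairs.extend(filter_pairs(candidates))
        # retrain (or continue training) the model on the expanded corpus
        model = train_model(pairs)
    return model

In this reading, each loop both enlarges the training corpus and produces a stronger model for the next round of synthetic-data generation, which is what allows the framework to exploit a continuous supply of monolingual data.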
Anthology ID:
2023.mtsummit-research.1
Volume:
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
Month:
September
Year:
2023
Address:
Macau SAR, China
Editors:
Masao Utiyama, Rui Wang
Venue:
MTSummit
Publisher:
Asia-Pacific Association for Machine Translation
Pages:
1–11
URL:
https://aclanthology.org/2023.mtsummit-research.1
Cite (ACL):
Wuying Liu, Wei Li, and Lin Wang. 2023. Multiloop Incremental Bootstrapping for Low-Resource Machine Translation. In Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track, pages 1–11, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal):
Multiloop Incremental Bootstrapping for Low-Resource Machine Translation (Liu et al., MTSummit 2023)
PDF:
https://aclanthology.org/2023.mtsummit-research.1.pdf