A Systematic Study Reveals Unexpected Interactions in Pre-Trained Neural Machine Translation

Ashleigh Richardson, Janet Wiles


Abstract
A significant challenge in developing translation systems for the world’s ∼7,000 languages is that very few have sufficient data for state-of-the-art techniques. Transfer learning is a promising direction for low-resource neural machine translation (NMT), but introduces many new variables which are often selected through ablation studies, costly trial-and-error, or niche expertise. When pre-training an NMT system for low-resource translation, the pre-training task is often chosen based on data abundance and similarity to the main task. Factors such as dataset sizes and similarity have typically been analysed independently in previous studies, due to the computational cost associated with systematic studies. However, these factors are not independent. We conducted a three-factor experiment to examine how language similarity, pre-training dataset size and main dataset size interacted in their effect on performance in pre-trained transformer-based low-resource NMT. We replicated the common finding that more data was beneficial in bilingual systems, but also found a statistically significant interaction between the three factors, which reduced the effectiveness of large pre-training datasets for some main task dataset sizes (p-value < 0.0018). The surprising trends identified in these interactions indicate that systematic studies of interactions may be a promising long-term direction for guiding research in low-resource neural methods.
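The three-factor analysis described in the abstract can be illustrated in outline with a standard full-factorial ANOVA. The sketch below is not the authors' code: the factor levels, replicate counts, and BLEU values are invented placeholders, and only the model structure (all main effects plus two- and three-way interactions, as in the paper's design) is taken from the abstract.

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical stand-in for the paper's results table: one BLEU score
# per combination of the three factors, with a few replicates each
# (factor levels and replicate count are assumptions, not the paper's).
levels = {
    "lang_sim": ["low", "high"],
    "pretrain_size": ["small", "medium", "large"],
    "main_size": ["small", "medium", "large"],
}
rows = [(s, p, m)
        for s in levels["lang_sim"]
        for p in levels["pretrain_size"]
        for m in levels["main_size"]
        for _ in range(3)]  # 3 replicates, e.g. random seeds
df = pd.DataFrame(rows, columns=["lang_sim", "pretrain_size", "main_size"])
df["bleu"] = rng.normal(20.0, 2.0, len(df))  # placeholder scores

# Full-factorial model: '*' expands to all main effects and all
# two- and three-way interactions between the categorical factors.
model = smf.ols(
    "bleu ~ C(lang_sim) * C(pretrain_size) * C(main_size)", data=df
).fit()

# The three-way interaction row of the ANOVA table tests whether the
# benefit of more pre-training data depends jointly on language
# similarity and main-task dataset size, as the abstract reports.
print(sm.stats.anova_lm(model, typ=2))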
Anthology ID:
2022.lrec-1.154
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Note:
Pages:
1437–1443
URL:
https://aclanthology.org/2022.lrec-1.154
Cite (ACL):
Ashleigh Richardson and Janet Wiles. 2022. A Systematic Study Reveals Unexpected Interactions in Pre-Trained Neural Machine Translation. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1437–1443, Marseille, France. European Language Resources Association.
Cite (Informal):
A Systematic Study Reveals Unexpected Interactions in Pre-Trained Neural Machine Translation (Richardson & Wiles, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.154.pdf