Translation Memories as Baselines for Low-Resource Machine Translation

Rebecca Knowles, Patrick Littell


Abstract
Low-resource machine translation research often requires building baselines to benchmark estimates of progress in translation quality. Neural and statistical phrase-based systems are often used with out-of-the-box settings to build these initial baselines before analyzing more sophisticated approaches, implicitly comparing the first machine translation system to the absence of any translation assistance. We argue that this approach overlooks a basic resource: if you have parallel text, you have a translation memory. In this work, we show that using available text as a translation memory baseline against which to compare machine translation systems is simple, effective, and can shed light on additional translation challenges.
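As a rough illustration of the idea in the abstract, the sketch below treats a list of parallel sentence pairs as a translation memory: for each new source sentence, it returns the target side of the most similar stored source sentence. This is a minimal sketch, not the authors' implementation. The function name `tm_translate`, the toy English-French pairs, and the use of Python's `difflib.SequenceMatcher` token-level ratio as the fuzzy-match score are all assumptions made for illustration; the abstract does not state which similarity measure the paper uses.

```python
from difflib import SequenceMatcher


def tm_translate(source, memory):
    """Return (target, score) for the memory entry whose source side
    is most similar to `source`.

    `memory` is a list of (source_sentence, target_sentence) pairs.
    Similarity is difflib's ratio over whitespace tokens, which is an
    assumed stand-in for whatever fuzzy-match metric a real TM uses.
    """
    best_target, best_score = "", -1.0
    src_tokens = source.split()
    for mem_source, mem_target in memory:
        score = SequenceMatcher(None, src_tokens, mem_source.split()).ratio()
        if score > best_score:
            best_target, best_score = mem_target, score
    return best_target, best_score


# Toy parallel text (hypothetical, for illustration only).
memory = [
    ("the house is red", "la maison est rouge"),
    ("the cat sleeps", "le chat dort"),
    ("good morning", "bonjour"),
]

if __name__ == "__main__":
    hypothesis, score = tm_translate("the house is blue", memory)
    print(hypothesis, round(score, 2))
```

The retrieved target sentences can then be scored with the same automatic metric as the machine translation output, so that both are compared on equal footing. The linear scan is adequate for small corpora; a larger memory would likely need indexing.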
Anthology ID:
2022.lrec-1.728
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6759–6767
URL:
https://aclanthology.org/2022.lrec-1.728
Cite (ACL):
Rebecca Knowles and Patrick Littell. 2022. Translation Memories as Baselines for Low-Resource Machine Translation. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6759–6767, Marseille, France. European Language Resources Association.
Cite (Informal):
Translation Memories as Baselines for Low-Resource Machine Translation (Knowles & Littell, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.728.pdf