Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation

El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, Hasan Cavusoglu

Abstract
We describe our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE). We view MT models at various training stages (i.e., checkpoints) as human learners at different levels. Hence, we employ an ensemble of multiple checkpoints from the same model to generate translation sequences with various levels of fluency. From each checkpoint, for our best model, we sample the n-best sequences (n=10) with a beam width of 100. We achieve a 37.57 macro F1 with a six-checkpoint model ensemble on the official shared task test data, outperforming a baseline Amazon translation system (21.30 macro F1) and ultimately demonstrating the utility of our intuitive method.
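As a concrete illustration of the decoding setup described above, here is a minimal Python sketch (not the authors' released code: the MarianMT model class, the checkpoint paths, and the helper name are illustrative assumptions) that pools the n-best (n=10) beam-search hypotheses, with a beam width of 100, from several checkpoints of the same model into one candidate set:

    from transformers import MarianMTModel, MarianTokenizer

    # Hypothetical checkpoint directories saved at different training stages;
    # the paper's best submission ensembles six checkpoints of one model.
    CHECKPOINTS = [f"checkpoints/epoch_{e:02d}" for e in (2, 4, 6, 8, 10, 12)]

    def nbest_union(source: str, checkpoints, n: int = 10, beam: int = 100) -> set:
        """Pool the n-best beam-search hypotheses from every checkpoint."""
        candidates = set()
        for path in checkpoints:
            # Assumes tokenizer and model weights were saved at each checkpoint.
            tokenizer = MarianTokenizer.from_pretrained(path)
            model = MarianMTModel.from_pretrained(path)
            inputs = tokenizer([source], return_tensors="pt")
            # Wide beam search; keep the top-n hypotheses per checkpoint.
            outputs = model.generate(
                **inputs, num_beams=beam, num_return_sequences=n, max_new_tokens=64
            )
            candidates.update(
                tokenizer.batch_decode(outputs, skip_special_tokens=True)
            )
        return candidates

Decoding per checkpoint and taking the union of hypotheses mirrors the paper's intuition: earlier checkpoints contribute less fluent, learner-like translations, while later checkpoints contribute more fluent alternatives, together covering translations at various levels of fluency.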
Anthology ID:
2020.ngt-1.20
Volume:
Proceedings of the Fourth Workshop on Neural Generation and Translation
Month:
July
Year:
2020
Address:
Online
Editors:
Alexandra Birch, Andrew Finch, Hiroaki Hayashi, Kenneth Heafield, Marcin Junczys-Dowmunt, Ioannis Konstas, Xian Li, Graham Neubig, Yusuke Oda
Venue:
NGT
Publisher:
Association for Computational Linguistics
Pages:
169–177
URL:
https://aclanthology.org/2020.ngt-1.20
DOI:
10.18653/v1/2020.ngt-1.20
Cite (ACL):
El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, and Hasan Cavusoglu. 2020. Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation. In Proceedings of the Fourth Workshop on Neural Generation and Translation, pages 169–177, Online. Association for Computational Linguistics.
Cite (Informal):
Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation (Nagoudi et al., NGT 2020)
PDF:
https://aclanthology.org/2020.ngt-1.20.pdf
Video:
http://slideslive.com/38929834