Multitask Models for Controlling the Complexity of Neural Machine Translation

Sweta Agrawal, Marine Carpuat


Abstract
We introduce a machine translation task where the output is aimed at audiences of different levels of target-language proficiency. We collect a novel dataset of news articles available in English and Spanish and written for diverse reading grade levels. We leverage this dataset to train multitask sequence-to-sequence models that translate Spanish into English targeted at an easier reading grade level than the original Spanish. We show that multitask models outperform pipeline approaches that translate and simplify text independently.
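A common way to condition a sequence-to-sequence model on a desired output reading level is to prepend a control token to the source sentence during training and inference. The sketch below illustrates that general technique only; the token format and grade values are illustrative assumptions, not necessarily the exact scheme used in the paper.

```python
# Sketch: encode the target reading grade level as a pseudo-token
# prepended to the source sentence, a standard way to make seq2seq
# output controllable. The "<grade_N>" format is an assumption for
# illustration, not the paper's confirmed token scheme.

def add_grade_token(src_sentence: str, target_grade: int) -> str:
    """Prepend a control token indicating the desired output grade level."""
    return f"<grade_{target_grade}> {src_sentence}"

# At training time, each Spanish source would be paired with an English
# reference written at `target_grade`; at test time, varying the token
# requests simpler or more complex output from the same model.
example = add_grade_token("El gato se sentó en la alfombra.", 4)
```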
Anthology ID:
2020.winlp-1.36
Volume:
Proceedings of the Fourth Widening Natural Language Processing Workshop
Month:
July
Year:
2020
Address:
Seattle, USA
Editors:
Rossana Cunha, Samira Shaikh, Erika Varis, Ryan Georgi, Alicia Tsai, Antonios Anastasopoulos, Khyathi Raghavi Chandu
Venue:
WiNLP
Publisher:
Association for Computational Linguistics
Pages:
136–139
URL:
https://aclanthology.org/2020.winlp-1.36
DOI:
10.18653/v1/2020.winlp-1.36
Cite (ACL):
Sweta Agrawal and Marine Carpuat. 2020. Multitask Models for Controlling the Complexity of Neural Machine Translation. In Proceedings of the Fourth Widening Natural Language Processing Workshop, pages 136–139, Seattle, USA. Association for Computational Linguistics.
Cite (Informal):
Multitask Models for Controlling the Complexity of Neural Machine Translation (Agrawal & Carpuat, WiNLP 2020)
Video:
http://slideslive.com/38929576