Towards Readability-Controlled Machine Translation of COVID-19 Texts

Fernando Alva-Manchego, Matthew Shardlow


Abstract
This project investigates the capabilities of Machine Translation models for generating translations at varying levels of readability, focusing on texts related to COVID-19. Whilst such texts can be translated automatically, the output may contain specialised terminology or be written in a style that is difficult for lay readers to understand. So far, we have collected a new dataset of manual simplifications for English and Spanish sentences from the TICO-19 dataset, and implemented baseline pipelines that combine Machine Translation and Text Simplification models.
Anthology ID:
2022.eamt-1.33
Volume:
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2022
Address:
Ghent, Belgium
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
287–288
URL:
https://aclanthology.org/2022.eamt-1.33
Cite (ACL):
Fernando Alva-Manchego and Matthew Shardlow. 2022. Towards Readability-Controlled Machine Translation of COVID-19 Texts. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 287–288, Ghent, Belgium. European Association for Machine Translation.
Cite (Informal):
Towards Readability-Controlled Machine Translation of COVID-19 Texts (Alva-Manchego & Shardlow, EAMT 2022)
PDF:
https://aclanthology.org/2022.eamt-1.33.pdf