Automated Text Simplification as a Preprocessing Step for Machine Translation into an Under-resourced Language

Sanja Štajner, Maja Popović


Abstract
In this work, we investigate the possibility of applying a fully automatic text simplification system to the English source in machine translation (MT) to improve its translation into an under-resourced language. We use a state-of-the-art automatic text simplification (ATS) system to simplify source sentences lexically and syntactically; the simplified sentences are then translated with two state-of-the-art English-to-Serbian MT systems, a phrase-based MT (PBMT) system and a neural MT (NMT) system. We explore three different scenarios for using the ATS in MT: (1) using the raw output of the ATS; (2) automatically filtering out the sentences with low grammaticality and meaning preservation scores; and (3) performing a minimal manual correction of the ATS output. Our results show an improvement in translation fluency regardless of the chosen scenario, and differences in the success of the three scenarios, depending on the MT approach used (PBMT or NMT), with regard to improving translation fluency and post-editing effort.
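Scenario (2) can be illustrated with a minimal sketch. The abstract does not state the score scale, the thresholds, or what is translated in place of a filtered-out sentence, so everything below is an assumption made for illustration only: the `SimplifiedSentence` record, the `select_mt_input` helper, the numeric scores, and the choice to fall back to the original English sentence are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimplifiedSentence:
    """One source sentence with its ATS output and two quality scores.

    The score fields and their scale are hypothetical; higher means better.
    """
    original: str
    simplified: str
    grammaticality: float
    meaning_preservation: float


def select_mt_input(
    sentences: List[SimplifiedSentence],
    min_grammaticality: float,
    min_meaning: float,
) -> List[str]:
    """Choose the text to send to the MT system for each sentence.

    A simplified sentence is kept only if both scores reach their threshold.
    Falling back to the original sentence otherwise is an assumption of this
    sketch, not something the abstract specifies.
    """
    selected = []
    for s in sentences:
        keep = (
            s.grammaticality >= min_grammaticality
            and s.meaning_preservation >= min_meaning
        )
        selected.append(s.simplified if keep else s.original)
    return selected


# Toy input with made-up scores, for illustration only.
batch = [
    SimplifiedSentence("The cat, which was old, slept.", "The old cat slept.", 4.0, 5.0),
    SimplifiedSentence("He declined the offer.", "He offer no.", 2.0, 3.0),
]
mt_input = select_mt_input(batch, min_grammaticality=3.0, min_meaning=3.0)
```

With these made-up scores, the first simplification passes both thresholds and is used, while the second fails the grammaticality threshold and the original sentence is sent to MT instead.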
Anthology ID:
R19-1131
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1141–1150
URL:
https://aclanthology.org/R19-1131/
DOI:
10.26615/978-954-452-056-4_131
Cite (ACL):
Sanja Štajner and Maja Popović. 2019. Automated Text Simplification as a Preprocessing Step for Machine Translation into an Under-resourced Language. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 1141–1150, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Automated Text Simplification as a Preprocessing Step for Machine Translation into an Under-resourced Language (Štajner & Popović, RANLP 2019)
PDF:
https://aclanthology.org/R19-1131.pdf