Semantic Structural Decomposition for Neural Machine Translation

Elior Sulem, Omri Abend, Ari Rappoport


Abstract
Building on recent advances in semantic parsing and text simplification, we investigate the use of semantic splitting of the source sentence as preprocessing for machine translation. We experiment with a Transformer model and evaluate using large-scale crowd-sourcing experiments. Results show a significant increase in fluency on long sentences in an English-to-French setting with a training corpus of 5M sentence pairs, while retaining comparable adequacy. We also perform a manual analysis that explores the tradeoff between adequacy and fluency in the case where all sentence lengths are considered.
Anthology ID:
2020.starsem-1.6
Volume:
Proceedings of the Ninth Joint Conference on Lexical and Computational Semantics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Iryna Gurevych, Marianna Apidianaki, Manaal Faruqui
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
50–57
URL:
https://aclanthology.org/2020.starsem-1.6
Cite (ACL):
Elior Sulem, Omri Abend, and Ari Rappoport. 2020. Semantic Structural Decomposition for Neural Machine Translation. In Proceedings of the Ninth Joint Conference on Lexical and Computational Semantics, pages 50–57, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Semantic Structural Decomposition for Neural Machine Translation (Sulem et al., *SEM 2020)
PDF:
https://aclanthology.org/2020.starsem-1.6.pdf
Code:
eliorsulem/semantic-structural-decomposition-for-nmt
Data:
WikiSplit