Towards Personalised and Document-level Machine Translation of Dialogue

Sebastian Vincent

doi:10.18653/v1/2021.eacl-srw.19

Abstract

State-of-the-art (SOTA) neural machine translation (NMT) systems translate texts at sentence level, ignoring context: intra-textual information, like the previous sentence, and extra-textual information, like the gender of the speaker. As a result, some sentences are translated incorrectly. Personalised NMT (PersNMT) and document-level NMT (DocNMT) incorporate this information into the translation process. Both fields are relatively new and previous work within them is limited. Moreover, there are no readily available robust evaluation metrics for them, which makes it difficult to develop better systems, as well as track global progress and compare different methods. This thesis proposal focuses on PersNMT and DocNMT for the domain of dialogue extracted from TV subtitles in five languages: English, Brazilian Portuguese, German, French and Polish. Three main challenges are addressed: (1) incorporating extra-textual information directly into NMT systems; (2) improving the machine translation of cohesion devices; (3) reliable evaluation for PersNMT and DocNMT.

Anthology ID: 2021.eacl-srw.19
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Month: April
Year: 2021
Address: Online
Editors: Ionut-Teodor Sorodoc, Madhumita Sushil, Ece Takmaz, Eneko Agirre
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 137–147
URL: https://aclanthology.org/2021.eacl-srw.19/
DOI: 10.18653/v1/2021.eacl-srw.19
Cite (ACL): Sebastian Vincent. 2021. Towards Personalised and Document-level Machine Translation of Dialogue. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 137–147, Online. Association for Computational Linguistics.
Cite (Informal): Towards Personalised and Document-level Machine Translation of Dialogue (Vincent, EACL 2021)
PDF: https://aclanthology.org/2021.eacl-srw.19.pdf
