Autocorrect in the Process of Translation — Multi-task Learning Improves Dialogue Machine Translation

Tao Wang; Chengqi Zhao; Mingxuan Wang; Lei Li; Deyi Xiong

doi:10.18653/v1/2021.naacl-industry.14

Abstract

Automatic translation of dialogue texts is in high demand in many real-life scenarios. However, existing neural machine translation systems deliver unsatisfactory results on such texts. In this paper, we conduct an in-depth analysis of a dialogue corpus and summarize three major issues in dialogue translation: pronoun dropping, punctuation dropping, and typos. In response to these challenges, we propose a joint learning method that identifies omissions and typos and uses context to translate dialogue utterances. To properly evaluate performance, we propose a manually annotated dataset with 1,931 Chinese-English parallel utterances from 300 dialogues as a benchmark testbed for dialogue translation. Our experiments show that the proposed method improves translation quality by 3.2 BLEU over the baselines. It also raises the recovery rate of omitted pronouns from 26.09% to 47.16%. We will publish the code and dataset publicly at https://xxx.xx.

Anthology ID: 2021.naacl-industry.14
Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers
Month: June
Year: 2021
Address: Online
Editors: Young-bum Kim, Yunyao Li, Owen Rambow
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 105–112
URL: https://aclanthology.org/2021.naacl-industry.14/
DOI: 10.18653/v1/2021.naacl-industry.14
Cite (ACL): Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei Li, and Deyi Xiong. 2021. Autocorrect in the Process of Translation — Multi-task Learning Improves Dialogue Machine Translation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, pages 105–112, Online. Association for Computational Linguistics.
Cite (Informal): Autocorrect in the Process of Translation — Multi-task Learning Improves Dialogue Machine Translation (Wang et al., NAACL 2021)
PDF: https://aclanthology.org/2021.naacl-industry.14.pdf
Video: https://aclanthology.org/2021.naacl-industry.14.mp4
