BibTeX
@inproceedings{duan-etal-2020-bilingual,
title = "Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences",
author = "Duan, Xiangyu and
Ji, Baijun and
Jia, Hao and
Tan, Min and
Zhang, Min and
Chen, Boxing and
Luo, Weihua and
Zhang, Yue",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.acl-main.143",
doi = "10.18653/v1/2020.acl-main.143",
pages = "1570--1579",
abstract = "In this paper, we propose a new task of machine translation (MT), which is based on no parallel sentences but can refer to a ground-truth bilingual dictionary. Motivated by the ability of a monolingual speaker learning to translate via looking up the bilingual dictionary, we propose the task to see how much potential an MT system can attain using the bilingual dictionary and large scale monolingual corpora, while is independent on parallel sentences. We propose anchored training (AT) to tackle the task. AT uses the bilingual dictionary to establish anchoring points for closing the gap between source language and target language. Experiments on various language pairs show that our approaches are significantly better than various baselines, including dictionary-based word-by-word translation, dictionary-supervised cross-lingual word embedding transformation, and unsupervised MT. On distant language pairs that are hard for unsupervised MT to perform well, AT performs remarkably better, achieving performances comparable to supervised SMT trained on more than 4M parallel sentences.",
}
MODS XML
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="duan-etal-2020-bilingual">
<titleInfo>
<title>Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences</title>
</titleInfo>
<name type="personal">
<namePart type="given">Xiangyu</namePart>
<namePart type="family">Duan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Baijun</namePart>
<namePart type="family">Ji</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hao</namePart>
<namePart type="family">Jia</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Min</namePart>
<namePart type="family">Tan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Min</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Boxing</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Weihua</namePart>
<namePart type="family">Luo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yue</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2020-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</title>
</titleInfo>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Online</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>In this paper, we propose a new machine translation (MT) task that uses no parallel sentences but can refer to a ground-truth bilingual dictionary. Motivated by the ability of a monolingual speaker to learn to translate by looking up a bilingual dictionary, we propose this task to see how much potential an MT system can attain using the bilingual dictionary and large-scale monolingual corpora while remaining independent of parallel sentences. We propose anchored training (AT) to tackle the task. AT uses the bilingual dictionary to establish anchoring points for closing the gap between the source language and the target language. Experiments on various language pairs show that our approaches are significantly better than various baselines, including dictionary-based word-by-word translation, dictionary-supervised cross-lingual word embedding transformation, and unsupervised MT. On distant language pairs, on which unsupervised MT struggles to perform well, AT performs remarkably better, achieving performance comparable to that of supervised SMT trained on more than 4M parallel sentences.</abstract>
<identifier type="citekey">duan-etal-2020-bilingual</identifier>
<identifier type="doi">10.18653/v1/2020.acl-main.143</identifier>
<location>
<url>https://aclanthology.org/2020.acl-main.143</url>
</location>
<part>
<date>2020-07</date>
<extent unit="page">
<start>1570</start>
<end>1579</end>
</extent>
</part>
</mods>
</modsCollection>
Endnote
%0 Conference Proceedings
%T Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences
%A Duan, Xiangyu
%A Ji, Baijun
%A Jia, Hao
%A Tan, Min
%A Zhang, Min
%A Chen, Boxing
%A Luo, Weihua
%A Zhang, Yue
%S Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
%D 2020
%8 July
%I Association for Computational Linguistics
%C Online
%F duan-etal-2020-bilingual
%X In this paper, we propose a new machine translation (MT) task that uses no parallel sentences but can refer to a ground-truth bilingual dictionary. Motivated by the ability of a monolingual speaker to learn to translate by looking up a bilingual dictionary, we propose this task to see how much potential an MT system can attain using the bilingual dictionary and large-scale monolingual corpora while remaining independent of parallel sentences. We propose anchored training (AT) to tackle the task. AT uses the bilingual dictionary to establish anchoring points for closing the gap between the source language and the target language. Experiments on various language pairs show that our approaches are significantly better than various baselines, including dictionary-based word-by-word translation, dictionary-supervised cross-lingual word embedding transformation, and unsupervised MT. On distant language pairs, on which unsupervised MT struggles to perform well, AT performs remarkably better, achieving performance comparable to that of supervised SMT trained on more than 4M parallel sentences.
%R 10.18653/v1/2020.acl-main.143
%U https://aclanthology.org/2020.acl-main.143
%U https://doi.org/10.18653/v1/2020.acl-main.143
%P 1570-1579
Markdown (Informal)
[Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences](https://aclanthology.org/2020.acl-main.143) (Duan et al., ACL 2020)