Machine Translation with Pre-specified Target-side Words Using a Semi-autoregressive Model

Seiichiro Kondo, Aomi Koyama, Tomoshige Kiyuna, Tosho Hirasawa, Mamoru Komachi


Abstract
We introduce our TMU Japanese-to-English system, which employs a semi-autoregressive model, to tackle the WAT 2021 restricted translation task. In this task, we translate an input sentence under the constraint that some words, called restricted target vocabularies (RTVs), must be contained in the output sentence. To satisfy this constraint, we use a semi-autoregressive model, namely RecoverSAT, because of its ability (known as “forced translation”) to insert specified words into the output sentence. When using “forced translation,” the order in which the RTVs are inserted is critical. In our system, we obtain word alignments between a source sentence and the corresponding RTVs using GIZA++ and then sort the RTVs in the order of their corresponding words or phrases in the source sentence. Using the model with RTVs sorted in this way, we succeeded in inserting all the RTVs into the output sentences for more than 96% of the test sentences. Moreover, we confirmed that sorting the RTVs improved the BLEU score compared with randomly ordered RTVs.
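
The sorting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the word alignment has already been computed (for example with GIZA++) and is available as a mapping from each RTV to the source token positions it aligns to. The function name `sort_rtvs_by_source_order`, the `alignment` format, and the handling of unaligned RTVs are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def sort_rtvs_by_source_order(
    rtvs: Sequence[str],
    alignment: Dict[str, List[int]],
) -> List[str]:
    """Order RTVs by the earliest source position each one is aligned to.

    `alignment` maps an RTV to the 0-based source token positions it is
    aligned with (an assumed format, not the raw GIZA++ output). RTVs with
    no alignment are placed last, keeping their original relative order
    (an assumption; the abstract does not say how such cases are handled).
    """

    def sort_key(item):
        original_index, rtv = item
        positions = alignment.get(rtv, [])
        first_position = min(positions) if positions else float("inf")
        # The original index breaks ties and keeps the ordering stable.
        return (first_position, original_index)

    return [rtv for _, rtv in sorted(enumerate(rtvs), key=sort_key)]


# Hypothetical example: three RTVs aligned to different source positions.
example_rtvs = ["neural network", "translation quality", "attention"]
example_alignment = {
    "neural network": [5, 6],
    "translation quality": [1, 2],
    "attention": [3],
}
sorted_rtvs = sort_rtvs_by_source_order(example_rtvs, example_alignment)
```

In this sketch the sorted list would then be handed to the model as the insertion order for forced translation; how that hand-off works in RecoverSAT is outside what the abstract specifies.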
Anthology ID:
2021.wat-1.5
Volume:
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Shohei Higashiyama, Hiroshi Manabe, Win Pa Pa, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
Venue:
WAT
Publisher:
Association for Computational Linguistics
Pages:
68–73
URL:
https://aclanthology.org/2021.wat-1.5
DOI:
10.18653/v1/2021.wat-1.5
Cite (ACL):
Seiichiro Kondo, Aomi Koyama, Tomoshige Kiyuna, Tosho Hirasawa, and Mamoru Komachi. 2021. Machine Translation with Pre-specified Target-side Words Using a Semi-autoregressive Model. In Proceedings of the 8th Workshop on Asian Translation (WAT2021), pages 68–73, Online. Association for Computational Linguistics.
Cite (Informal):
Machine Translation with Pre-specified Target-side Words Using a Semi-autoregressive Model (Kondo et al., WAT 2021)
PDF:
https://aclanthology.org/2021.wat-1.5.pdf
Data
ASPEC