Integrating empty category detection into preordering Machine Translation

Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto


Abstract
We propose a method for integrating Japanese empty category detection into the preordering process of Japanese-to-English statistical machine translation. First, we apply machine-learning-based empty category detection to estimate the position and the type of empty categories in the constituent tree of the source sentence. Then, we apply discriminative preordering to the augmented constituent tree, in which empty categories are treated as if they were normal lexical symbols. We find that it is effective to filter empty categories based on the confidence of estimation. Our experiments show that, for the IWSLT dataset consisting of short travel conversations, the insertion of empty categories alone improves the BLEU score from 33.2 to 34.3 and the RIBES score from 76.3 to 78.7, which implies that reordering has improved. For the KFTT dataset consisting of Wikipedia sentences, the proposed preordering method considering empty categories improves the BLEU score from 19.9 to 20.2 and the RIBES score from 66.2 to 66.3, which shows that both translation and reordering have improved slightly.
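The confidence-based filtering and insertion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it works on a flat token list rather than a constituent tree, and the `EmptyCategory` class, the `insert_empty_categories` function, the labels, the example tokens, and the 0.5 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EmptyCategory:
    position: int      # token index before which the empty category is inserted
    label: str         # hypothetical type label, e.g. "*pro*"
    confidence: float  # detector's estimated confidence (assumed to lie in [0, 1])


def insert_empty_categories(tokens, candidates, threshold=0.5):
    """Keep candidates whose confidence meets the threshold and insert
    them into the token sequence as if they were ordinary tokens."""
    kept = [c for c in candidates if c.confidence >= threshold]
    augmented = list(tokens)
    # Insert from right to left so that earlier indices remain valid.
    for c in sorted(kept, key=lambda c: c.position, reverse=True):
        augmented.insert(c.position, c.label)
    return augmented


# Hypothetical example: a romanized sentence with a dropped subject.
tokens = ["hon", "o", "yonda"]
candidates = [
    EmptyCategory(position=0, label="*pro*", confidence=0.9),
    EmptyCategory(position=2, label="*T*", confidence=0.2),
]
print(insert_empty_categories(tokens, candidates))
```

In this sketch only the high-confidence candidate is kept, so a downstream preordering model would see the inserted symbol as one more token to reorder. Lowering the threshold admits more candidates at the cost of more detection errors.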
Anthology ID:
W16-4615
Volume:
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venues:
WAT | WS
Publisher:
The COLING 2016 Organizing Committee
Pages:
157–165
URL:
https://aclanthology.org/W16-4615
Cite (ACL):
Shunsuke Takeno, Masaaki Nagata, and Kazuhide Yamamoto. 2016. Integrating empty category detection into preordering Machine Translation. In Proceedings of the 3rd Workshop on Asian Translation (WAT2016), pages 157–165, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Integrating empty category detection into preordering Machine Translation (Takeno et al., 2016)
PDF:
https://aclanthology.org/W16-4615.pdf