Dependency Treelet Translation: The Convergence of Statistical and Example-based Machine-translation?

Arul Menezes, Chris Quirk


Abstract
We describe a novel approach to machine translation that combines the strengths of the two leading corpus-based approaches: phrasal SMT and EBMT. We use a syntactically informed decoder and reordering model based on the source dependency tree, in combination with conventional SMT models, to combine the power of phrasal SMT with the linguistic generality available in a parser. We show that this approach significantly outperforms a leading string-based phrasal SMT decoder and an EBMT system. We present results from two radically different language pairs, and investigate the sensitivity of this approach to parse quality by using two distinct parsers and oracle experiments. We also validate our automated BLEU scores with a small human evaluation.
Anthology ID: 2005.mtsummit-ebmt.13
Volume: Workshop on example-based machine translation
Month: September 13-15
Year: 2005
Address: Phuket, Thailand
Venue: MTSummit
Pages: 99–108
URL: https://aclanthology.org/2005.mtsummit-ebmt.13
Cite (ACL):
Arul Menezes and Chris Quirk. 2005. Dependency Treelet Translation: The Convergence of Statistical and Example-based Machine-translation?. In Workshop on example-based machine translation, pages 99–108, Phuket, Thailand.
Cite (Informal):
Dependency Treelet Translation: The Convergence of Statistical and Example-based Machine-translation? (Menezes & Quirk, MTSummit 2005)
PDF: https://aclanthology.org/2005.mtsummit-ebmt.13.pdf