Expressive hierarchical rule extraction for left-to-right translation

Maryam Siahbani, Anoop Sarkar


Abstract
Left-to-right (LR) decoding (Watanabe et al., 2006) is a promising decoding algorithm for hierarchical phrase-based translation (Hiero) that visits input spans in arbitrary order while producing the output translation in left-to-right order. This leads to far fewer language model calls. However, the constrained SCFG used in LR-Hiero (GNF), with at most two non-terminals, is unable to account for some complex phrasal reordering. Allowing more non-terminals in the rules results in a more expressive grammar. LR-decoding can be used to decode with SCFGs with more than two non-terminals, but the CKY decoders used for Hiero systems cannot deal with such expressive grammars due to a blowup in computational complexity. In this paper we present a dynamic programming algorithm for GNF rule extraction which efficiently extracts sentence-level SCFG rule sets with an arbitrary number of non-terminals. We analyze the performance of the obtained grammar for statistical machine translation on three language pairs.
Anthology ID: 2014.amta-researchers.1
Volume: Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track
Month: October 22-26
Year: 2014
Address: Vancouver, Canada
Venue: AMTA
Publisher: Association for Machine Translation in the Americas
Pages: 1–14
URL: https://aclanthology.org/2014.amta-researchers.1
Cite (ACL): Maryam Siahbani and Anoop Sarkar. 2014. Expressive hierarchical rule extraction for left-to-right translation. In Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, pages 1–14, Vancouver, Canada. Association for Machine Translation in the Americas.
Cite (Informal): Expressive hierarchical rule extraction for left-to-right translation (Siahbani & Sarkar, AMTA 2014)
PDF: https://aclanthology.org/2014.amta-researchers.1.pdf