Rule-based Reordering Space in Statistical Machine Translation

Nicolas Pécheux, Alexander Allauzen, François Yvon


Abstract
In Statistical Machine Translation (SMT), the constraints on word reorderings have a great impact on the set of potential translations that are explored. Notwithstanding computational issues, the reordering space of an SMT system needs to be designed with great care: while a larger search space is likely to yield better translations, it may also lead to more decoding errors, because of the added ambiguity and the interaction with the pruning strategy. In this paper, we study this trade-off using a state-of-the-art translation system, where all reorderings are represented in a word lattice prior to decoding. This allows us to directly explore and compare different reordering spaces. We study in detail a rule-based preordering system, varying the length or number of rules and the tagset used, and contrasting it with oracle settings and purely combinatorial subsets of permutations. We focus on two language pairs: English-French, a close language pair, and English-German, known to be a more challenging pair for reordering.
Anthology ID:
L14-1575
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1800–1806
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/735_Paper.pdf
Cite (ACL):
Nicolas Pécheux, Alexander Allauzen, and François Yvon. 2014. Rule-based Reordering Space in Statistical Machine Translation. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1800–1806, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Rule-based Reordering Space in Statistical Machine Translation (Pécheux et al., LREC 2014)