Universal Reordering via Linguistic Typology

Joachim Daiber, Miloš Stanojević, Khalil Sima’an


Abstract
In this paper we explore the novel idea of building a single universal reordering model from English to a large number of target languages. To build this model, we exploit typological features of word order for a large number of target languages together with source (English) syntactic features, and we train the model on a single combined parallel corpus representing all 22 language pairs involved. We contribute experimental evidence for the usefulness of linguistically defined typological features for building such a model. When the universal reordering model is used for preordering followed by monotone translation (no reordering inside the decoder), our experiments show that this pipeline gives translation performance comparable to or better than a phrase-based baseline for a large number of language pairs (12 out of 22) from diverse language families.
Anthology ID:
C16-1298
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
3167–3176
URL:
https://aclanthology.org/C16-1298
Cite (ACL):
Joachim Daiber, Miloš Stanojević, and Khalil Sima’an. 2016. Universal Reordering via Linguistic Typology. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 3167–3176, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Universal Reordering via Linguistic Typology (Daiber et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1298.pdf