On Complex Word Alignment Configurations

Miriam Kaeshammer, Anika Westburg


Abstract
Resources of manual word alignments contain configurations that are beyond the alignment capacity of current translation models, hence the term complex alignment configuration. They have been the matter of some debate in the machine translation community, as they call for more powerful translation models that come with further complications. In this work we investigate instances of complex alignment configurations in data sets of four different language pairs to shed more light on the nature and cause of those configurations. For the English-German alignments from Padó and Lapata (2006), for instance, we find that only a small fraction of the complex configurations are due to real annotation errors. While a third of the complex configurations in this data set could be simplified when annotating according to a different style guide, the remaining ones are phenomena that one would like to be able to generate during translation. Those instances are mainly caused by the different word order of English and German. Our findings thus motivate further research in the area of translation beyond phrase-based and context-free translation modeling.
Anthology ID:
L14-1338
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1773–1780
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/390_Paper.pdf
Cite (ACL):
Miriam Kaeshammer and Anika Westburg. 2014. On Complex Word Alignment Configurations. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1773–1780, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
On Complex Word Alignment Configurations (Kaeshammer & Westburg, LREC 2014)