Gideon Kotzé


Large aligned treebanks for syntax-based machine translation
Gideon Kotzé | Vincent Vandeghinste | Scott Martens | Jörg Tiedemann
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a collection of parallel treebanks that have been automatically aligned on both the terminal and the nonterminal constituent level for use in syntax-based machine translation. We describe how they were constructed and applied to a syntax- and example-based machine translation system called Parse and Corpus-Based Machine Translation (PaCo-MT). For the language pair Dutch to English, we present evaluation scores of both the nonterminal constituent alignments and the MT system itself, and in the latter case, compare them with those of Moses, a current state-of-the-art statistical MT system, when trained on the same data.


Finding statistically motivated features influencing subtree alignment performance
Gideon Kotzé
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)


A Discriminative Approach to Tree Alignment
Jörg Tiedemann | Gideon Kotzé
Proceedings of the Workshop on Natural Language Processing Methods and Corpora in Translation, Lexicography, and Language Learning