Discontinuous Verb Phrases in Parsing and Machine Translation of English and German

Sharid Loáiciga, Kristina Gulordava


Abstract
In this paper, we focus on the verb-particle (V-Prt) split construction in English and German and the difficulty it poses for parsing and Machine Translation (MT). For German, we use an existing test suite of V-Prt split constructions, while for English, we build a new and comparable test suite from raw data. These two data sets are then used to analyse the errors in dependency parsing, word-level alignment and MT that arise from the discontinuous order in V-Prt split constructions. In the automatic alignments of parallel corpora, most of the particles align to NULL. These misalignments, together with the inability of phrase-based MT systems to recover discontinuous phrases, result in low-quality translations of V-Prt split constructions in both English and German. However, our results show that the V-Prt split phrases are correctly parsed in 90% of cases, suggesting that syntactic-based MT should perform better on these constructions. We evaluate a syntactic-based MT system on German and compare its performance to that of the phrase-based system.
Anthology ID:
L16-1453
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2839–2845
URL:
https://aclanthology.org/L16-1453
Cite (ACL):
Sharid Loáiciga and Kristina Gulordava. 2016. Discontinuous Verb Phrases in Parsing and Machine Translation of English and German. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2839–2845, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Discontinuous Verb Phrases in Parsing and Machine Translation of English and German (Loáiciga & Gulordava, LREC 2016)
PDF:
https://aclanthology.org/L16-1453.pdf