Modelling pronominal anaphora in statistical machine translation

Christian Hardmeier, Marcello Federico


Abstract
Current Statistical Machine Translation (SMT) systems translate texts sentence by sentence without considering any cross-sentential context. Assuming independence between sentences makes it difficult to take certain translation decisions when the necessary information cannot be determined locally. We argue for the necessity to include cross-sentence dependencies in SMT. As a case in point, we study the problem of pronominal anaphora translation by manually evaluating German-English SMT output. We then present a word dependency model for SMT, which can represent links between word pairs in the same or in different sentences. We use this model to integrate the output of a coreference resolution system into English-German SMT with a view to improving the translation of anaphoric pronouns.
Anthology ID:
2010.iwslt-papers.10
Volume:
Proceedings of the 7th International Workshop on Spoken Language Translation: Papers
Month:
December 2-3
Year:
2010
Address:
Paris, France
Venue:
IWSLT
SIG:
SIGSLT
Pages:
283–289
URL:
https://aclanthology.org/2010.iwslt-papers.10
Cite (ACL):
Christian Hardmeier and Marcello Federico. 2010. Modelling pronominal anaphora in statistical machine translation. In Proceedings of the 7th International Workshop on Spoken Language Translation: Papers, pages 283–289, Paris, France.
Cite (Informal):
Modelling pronominal anaphora in statistical machine translation (Hardmeier & Federico, IWSLT 2010)
PDF:
https://aclanthology.org/2010.iwslt-papers.10.pdf