Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements

Marco Turchi; Matteo Negri


Abstract

The automatic estimation of machine translation (MT) output quality is an active research area due to its many potential applications (e.g. aiding human translation and post-editing, re-ranking MT hypotheses, MT system combination). Current approaches to the task rely on supervised learning methods for which high-quality labelled data is fundamental. In this framework, quality estimation (QE) has been mainly addressed as a regression problem where models trained on (source, target) sentence pairs annotated with continuous scores (in the [0, 1] interval) are used to assign quality scores (in the same interval) to unseen data. Such a definition of the problem assumes that continuous scores are informative and easily interpretable by different users. These assumptions, however, conflict with the subjectivity inherent to human translation and evaluation. On the one hand, the subjectivity of human judgements adds noise and biases to annotations based on scaled values. This problem reduces the usability of the resulting datasets, especially in application scenarios where a sharp distinction between good and bad translations is needed. On the other hand, continuous scores are not always sufficient to decide whether a translation is actually acceptable or not. To overcome these issues, we present an automatic method for the annotation of (source, target) pairs with binary judgements that reflect an empirical and easily interpretable notion of quality. The method is applied to annotate three QE datasets for different language combinations with binary judgements. The three datasets are combined into a single resource, called BinQE, which can be freely downloaded from http://hlt.fbk.eu/technologies/binqe.

Anthology ID: L14-1400
Volume: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month: May
Year: 2014
Address: Reykjavik, Iceland
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 1788–1792
URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/473_Paper.pdf
Cite (ACL): Marco Turchi and Matteo Negri. 2014. Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1788–1792, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal): Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements (Turchi & Negri, LREC 2014)
PDF: http://www.lrec-conf.org/proceedings/lrec2014/pdf/473_Paper.pdf
