Quality Estimation without Human-labeled Data

Yi-Lin Tuan; Ahmed El-Kishky; Adithya Renduchintala; Vishrav Chaudhary; Francisco Guzmán; Lucia Specia

doi:10.18653/v1/2021.eacl-main.50

Abstract

Quality estimation aims to measure the quality of translated content without access to a reference translation. This is crucial for machine translation systems in real-world scenarios where high-quality translation is needed. Many approaches to quality estimation exist, but they are based on supervised machine learning, which requires costly human-labeled data. As an alternative, we propose a technique that does not rely on examples from human annotators and instead uses synthetic training data. We train off-the-shelf architectures for supervised quality estimation on our synthetic data and show that the resulting models achieve performance comparable to that of models trained on human-annotated data, for both sentence-level and word-level prediction.

Anthology ID: 2021.eacl-main.50
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month: April
Year: 2021
Address: Online
Editors: Paola Merlo, Jorg Tiedemann, Reut Tsarfaty
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 619–625
URL: https://aclanthology.org/2021.eacl-main.50/
DOI: 10.18653/v1/2021.eacl-main.50
Cite (ACL): Yi-Lin Tuan, Ahmed El-Kishky, Adithya Renduchintala, Vishrav Chaudhary, Francisco Guzmán, and Lucia Specia. 2021. Quality Estimation without Human-labeled Data. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 619–625, Online. Association for Computational Linguistics.
Cite (Informal): Quality Estimation without Human-labeled Data (Tuan et al., EACL 2021)
PDF: https://aclanthology.org/2021.eacl-main.50.pdf
Data: WikiMatrix
