Agent and User-Generated Content and its Impact on Customer Support MT

Madalena Gonçalves, Marianna Buchicchio, Craig Stewart, Helena Moniz, Alon Lavie


Abstract
This paper illustrates a new evaluation framework developed at Unbabel for measuring the quality of source language text and its effect on both Machine Translation (MT) and Human Post-Edition (PE) performed by non-professional post-editors. We examine both agent and user-generated content from the Customer Support domain and propose that differentiating the two is crucial to obtaining high-quality translation output. Furthermore, we present results of initial experimentation with a new evaluation typology based on the Multidimensional Quality Metrics (MQM) Framework (Lommel et al., 2014), specifically tailored toward the evaluation of source language text. We show how the MQM Framework (Lommel et al., 2014) can be adapted to assess errors in monolingual source texts and demonstrate how very specific source errors propagate to the MT and PE targets. Finally, we illustrate how MT systems are not robust enough to handle very specific source noise in the context of Customer Support data.
Anthology ID:
2022.eamt-1.23
Volume:
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2022
Address:
Ghent, Belgium
Editors:
Helena Moniz, Lieve Macken, Andrew Rufener, Loïc Barrault, Marta R. Costa-jussà, Christophe Declercq, Maarit Koponen, Ellie Kemp, Spyridon Pilos, Mikel L. Forcada, Carolina Scarton, Joachim Van den Bogaert, Joke Daems, Arda Tezcan, Bram Vanroy, Margot Fonteyne
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
201–210
URL:
https://aclanthology.org/2022.eamt-1.23
Cite (ACL):
Madalena Gonçalves, Marianna Buchicchio, Craig Stewart, Helena Moniz, and Alon Lavie. 2022. Agent and User-Generated Content and its Impact on Customer Support MT. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 201–210, Ghent, Belgium. European Association for Machine Translation.
Cite (Informal):
Agent and User-Generated Content and its Impact on Customer Support MT (Gonçalves et al., EAMT 2022)
PDF:
https://aclanthology.org/2022.eamt-1.23.pdf