A Synthesis of Human and Machine: Correlating “New” Automatic Evaluation Metrics with Human Assessments

Mara Nunziatini, Andrea Alfieri


Abstract
The session will provide an overview of some of the new Machine Translation metrics available on the market and analyze whether and how these metrics correlate, at the segment level, with the results of Adequacy and Fluency Human Assessments. It will also compare them against TER scores and Levenshtein Distance (two of our currently preferred metrics) and against each other. The information in this session will help attendees better understand the strengths and weaknesses of these metrics and make informed decisions when forecasting MT production.
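As a rough illustration of what a segment-level correlation between an edit-distance metric and human scores can look like, here is a minimal Python sketch. It is not the authors' pipeline: the toy segments, the 1 to 5 adequacy scale, the normalization by the longer string, and the choice of Pearson's r are all assumptions made for the example.

```python
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: (MT output, post-edited text, human adequacy score 1-5).
segments = [
    ("The cat sat on the mat.", "The cat sat on the mat.", 5),
    ("He go to school yesterday.", "He went to school yesterday.", 4),
    ("Contract the is void.", "The contract is void.", 3),
    ("Blue idea sleeps.", "The proposal was rejected.", 1),
]

# Normalize each distance by the longer string so segments of different
# lengths are comparable (an assumption, not taken from the session).
distances = [levenshtein(mt, pe) / max(len(mt), len(pe), 1)
             for mt, pe, _ in segments]
scores = [score for _, _, score in segments]

r = pearson(distances, scores)
print(f"Pearson r between normalized edit distance and adequacy: {r:.3f}")
```

A negative coefficient here would mean that segments needing more edits tend to receive lower human scores; with real data, a rank correlation such as Spearman's or Kendall's may be a better fit for ordinal human ratings.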
Anthology ID:
2021.mtsummit-up.29
Volume:
Proceedings of Machine Translation Summit XVIII: Users and Providers Track
Month:
August
Year:
2021
Address:
Virtual
Editors:
Janice Campbell, Ben Huyck, Stephen Larocca, Jay Marciano, Konstantin Savenkov, Alex Yanishevsky
Venue:
MTSummit
Publisher:
Association for Machine Translation in the Americas
Pages:
440–465
URL:
https://aclanthology.org/2021.mtsummit-up.29
Cite (ACL):
Mara Nunziatini and Andrea Alfieri. 2021. A Synthesis of Human and Machine: Correlating “New” Automatic Evaluation Metrics with Human Assessments. In Proceedings of Machine Translation Summit XVIII: Users and Providers Track, pages 440–465, Virtual. Association for Machine Translation in the Americas.
Cite (Informal):
A Synthesis of Human and Machine: Correlating “New” Automatic Evaluation Metrics with Human Assessments (Nunziatini & Alfieri, MTSummit 2021)
Presentation:
 2021.mtsummit-up.29.Presentation.pdf