Insights from Gathering MT Productivity Metrics at Scale

Georg Kirchner


Abstract
In this paper, we describe Dell EMC’s framework to automatically collect MT-related productivity metrics from a large translation supply chain over an extended period of time, the characteristics and volume of the gathered data, and the insights from analyzing the data to guide our MT strategy. Aligning tools, processes and people required decisions, concessions and contributions from Dell management, technology providers, tool implementors, LSPs and linguists to harvest data at scale over 2+ years while Dell EMC migrated from customized SMT to generic NMT and then customized NMT systems. For content in two quality tiers, we ranked language pairs by productivity, graphed trendlines, compared the time needed to edit machine translations versus fuzzy matches, and studied the time spent on segments with no post-edits. Going by the post-edit density, we also reviewed the segment distribution on a post-edit scale of 1 to 10 and any correlation between the extent of edits and segment length.
Anthology ID:
2020.eamt-1.38
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Editors:
André Martins, Helena Moniz, Sara Fumega, Bruno Martins, Fernando Batista, Luisa Coheur, Carla Parra, Isabel Trancoso, Marco Turchi, Arianna Bisazza, Joss Moorkens, Ana Guerberof, Mary Nurminen, Lena Marg, Mikel L. Forcada
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
353–362
URL:
https://aclanthology.org/2020.eamt-1.38
Cite (ACL):
Georg Kirchner. 2020. Insights from Gathering MT Productivity Metrics at Scale. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 353–362, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Insights from Gathering MT Productivity Metrics at Scale (Kirchner, EAMT 2020)
PDF:
https://aclanthology.org/2020.eamt-1.38.pdf