A fluency error categorization scheme to guide automated machine translation evaluation

Debbie Elliott, Anthony Hartley, Eric Atwell


Abstract
Existing automated MT evaluation methods often require expert human translations. These are produced for every language pair evaluated and, due to this expense, subsequent evaluations tend to rely on the same texts, which do not necessarily reflect real MT use. In contrast, we are designing an automated MT evaluation system, intended for use by post-editors, purchasers and developers, that requires nothing but the raw MT output. Furthermore, our research is based on texts that reflect corporate use of MT. This paper describes our first step in system design: a hierarchical classification scheme of fluency errors in English MT output, to enable us to identify error types and frequencies, and guide the selection of errors for automated detection. We present results from the statistical analysis of 20,000 words of MT output, manually annotated using our classification scheme, and describe correlations between error frequencies and human scores for fluency and adequacy.
Anthology ID:
2004.amta-papers.8
Volume:
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
September 28 – October 2
Year:
2004
Address:
Washington, USA
Editors:
Robert E. Frederking, Kathryn B. Taylor
Venue:
AMTA
Publisher:
Springer
Pages:
64–73
URL:
https://link.springer.com/chapter/10.1007/978-3-540-30194-3_8
Cite (ACL):
Debbie Elliott, Anthony Hartley, and Eric Atwell. 2004. A fluency error categorization scheme to guide automated machine translation evaluation. In Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 64–73, Washington, USA. Springer.
Cite (Informal):
A fluency error categorization scheme to guide automated machine translation evaluation (Elliott et al., AMTA 2004)
PDF:
https://link.springer.com/chapter/10.1007/978-3-540-30194-3_8