A Statistical Analysis of Automated MT Evaluation Metrics for Assessments in Task-Based MT Evaluation

Calandra R. Tate


Abstract
This paper applies nonparametric statistical techniques to Machine Translation (MT) evaluation using data from a large-scale task-based study. In particular, it studies the relationship between human performance on an information extraction task with translated documents and the scores that well-known automated translation evaluation metrics assign to those documents. Findings from a correlation analysis of this relationship are presented and contrasted with current strategies for evaluating translations. An extended analysis involving a novel approach to assessing partial rank correlation in the presence of grouping factors is also discussed. This work exposes the limitations of the descriptive statistics generally used in this area, mainly correlation analysis, when automated metrics are used to assess translations for task-based purposes.
Anthology ID:
2008.amta-papers.17
Volume:
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers
Month:
October 21-25
Year:
2008
Address:
Waikiki, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
182–191
URL:
https://aclanthology.org/2008.amta-papers.17
Cite (ACL):
Calandra R. Tate. 2008. A Statistical Analysis of Automated MT Evaluation Metrics for Assessments in Task-Based MT Evaluation. In Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers, pages 182–191, Waikiki, USA. Association for Machine Translation in the Americas.
Cite (Informal):
A Statistical Analysis of Automated MT Evaluation Metrics for Assessments in Task-Based MT Evaluation (Tate, AMTA 2008)
PDF:
https://aclanthology.org/2008.amta-papers.17.pdf