FEMTI: creating and using a framework for MT evaluation

Margaret King, Andrei Popescu-Belis, Eduard Hovy


Abstract
This paper presents FEMTI, a web-based Framework for the Evaluation of Machine Translation in ISLE. FEMTI offers structured descriptions of potential user needs, linked to an overview of technical characteristics of MT systems. The description of possible systems is mainly articulated around the quality characteristics for software products set out in ISO/IEC standard 9126. Following the philosophy set out there and in the related 14598 series of standards, each quality characteristic bottoms out in metrics which may be applied to a particular instance of a system in order to judge how satisfactory the system is with respect to that characteristic. An evaluator can use the description of user needs to help identify the specific needs of the evaluation and the relations between them. The evaluator can then follow the pointers to the system description to determine what metrics should be applied and how. In the current state of the framework, emphasis is on being exhaustive, including as much as possible of the information available in the literature on machine translation evaluation. Future work will aim at being more analytic, looking at characteristics and metrics to see how they relate to one another, validating metrics and investigating the correlation between particular metrics and human judgement.
Anthology ID:
2003.mtsummit-papers.30
Volume:
Proceedings of Machine Translation Summit IX: Papers
Month:
September 23-27
Year:
2003
Address:
New Orleans, USA
Venue:
MTSummit
URL:
https://aclanthology.org/2003.mtsummit-papers.30
Cite (ACL):
Margaret King, Andrei Popescu-Belis, and Eduard Hovy. 2003. FEMTI: creating and using a framework for MT evaluation. In Proceedings of Machine Translation Summit IX: Papers, New Orleans, USA.
Cite (Informal):
FEMTI: creating and using a framework for MT evaluation (King et al., MTSummit 2003)
PDF:
https://aclanthology.org/2003.mtsummit-papers.30.pdf