Human semantic MT evaluation with HMEANT for IWSLT 2013

Chi-kiu Lo, Dekai Wu


Abstract
We present the results of a large-scale human semantic MT evaluation with HMEANT on the IWSLT 2013 German-English MT and SLT tracks, and show that HMEANT evaluates the performance of the MT systems differently from BLEU and TER. All the translations, together with the references, are annotated by native English speakers in both the semantic role labeling stage and the role filler alignment stage of HMEANT. We obtain high inter-annotator agreement and low annotation time costs, which indicates that it is feasible to run a large-scale human semantic MT evaluation campaign using HMEANT. Our results also show that HMEANT is a robust and reliable semantic MT evaluation metric for large-scale evaluation campaigns: it is inexpensive and simple, yet it maintains the semantic representational transparency needed to offer a perspective on the performance of state-of-the-art MT systems that differs from that of BLEU and TER.
Anthology ID:
2013.iwslt-evaluation.2
Volume:
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 5-6
Year:
2013
Address:
Heidelberg, Germany
Editor:
Joy Ying Zhang
Venue:
IWSLT
SIG:
SIGSLT
URL:
https://aclanthology.org/2013.iwslt-evaluation.2
Cite (ACL):
Chi-kiu Lo and Dekai Wu. 2013. Human semantic MT evaluation with HMEANT for IWSLT 2013. In Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign, Heidelberg, Germany.
Cite (Informal):
Human semantic MT evaluation with HMEANT for IWSLT 2013 (Lo & Wu, IWSLT 2013)
PDF:
https://aclanthology.org/2013.iwslt-evaluation.2.pdf