A Tutorial on Evaluation Metrics used in Natural Language Generation

Mitesh M. Khapra, Ananya B. Sai


Abstract
The advent of Deep Learning and the availability of large-scale datasets have accelerated research on Natural Language Generation with a focus on newer tasks and better models. With such rapid progress, it is vital to assess the extent of scientific progress made and identify the areas/components that need improvement. To accomplish this in an automatic and reliable manner, the NLP community has actively pursued the development of automatic evaluation metrics. Especially in the last few years, there has been an increasing focus on evaluation metrics, with several criticisms of existing metrics and proposals for several new metrics. This tutorial presents the evolution of automatic evaluation metrics to their current state along with the emerging trends in this field by specifically addressing the following questions: (i) What makes NLG evaluation challenging? (ii) Why do we need automatic evaluation metrics? (iii) What are the existing automatic evaluation metrics and how can they be organised in a coherent taxonomy? (iv) What are the criticisms and shortcomings of existing metrics? (v) What are the possible future directions of research?
Anthology ID:
2021.naacl-tutorials.4
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials
Month:
June
Year:
2021
Address:
Online
Editors:
Greg Kondrak, Kalina Bontcheva, Dan Gillick
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
15–19
URL:
https://aclanthology.org/2021.naacl-tutorials.4
DOI:
10.18653/v1/2021.naacl-tutorials.4
Cite (ACL):
Mitesh M. Khapra and Ananya B. Sai. 2021. A Tutorial on Evaluation Metrics used in Natural Language Generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials, pages 15–19, Online. Association for Computational Linguistics.
Cite (Informal):
A Tutorial on Evaluation Metrics used in Natural Language Generation (Khapra & Sai, NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-tutorials.4.pdf
Video:
https://aclanthology.org/2021.naacl-tutorials.4.mp4