The INLG 2024 Tutorial on Human Evaluation of NLP System Quality: Background, Overall Aims, and Summaries of Taught Units

Anya Belz, João Sedoc, Craig Thomson, Simon Mille, Rudali Huidrom

Abstract
Following numerous calls in the literature over the past ten years for improved practices and standardisation in the human evaluation of Natural Language Processing systems, we held a tutorial on the topic at the 2024 INLG Conference. The tutorial addressed the structure, development, design, implementation, execution and analysis of human evaluations of NLP system quality. Hands-on practical sessions were run, designed to facilitate assimilation of the material presented. Slides, lecture recordings, code and data have been made available on GitHub (https://github.com/Human-Evaluation-Tutorial/INLG-2024-Tutorial). In this paper, we provide summaries of the content of the eight units of the tutorial, alongside its research context and aims.
Anthology ID: 2024.inlg-tutorials.1
Volume: Proceedings of the 17th International Natural Language Generation Conference: Tutorial Abstract
Month: September
Year: 2024
Address: Tokyo, Japan
Editors: Anya Belz, João Sedoc, Craig Thomson, Simon Mille, Rudali Huidrom
Venue: INLG
SIG: SIGGEN
Publisher: Association for Computational Linguistics
Pages: 1–12
URL: https://aclanthology.org/2024.inlg-tutorials.1
Cite (ACL):
Anya Belz, João Sedoc, Craig Thomson, Simon Mille, and Rudali Huidrom. 2024. The INLG 2024 Tutorial on Human Evaluation of NLP System Quality: Background, Overall Aims, and Summaries of Taught Units. In Proceedings of the 17th International Natural Language Generation Conference: Tutorial Abstract, pages 1–12, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
The INLG 2024 Tutorial on Human Evaluation of NLP System Quality: Background, Overall Aims, and Summaries of Taught Units (Belz et al., INLG 2024)
PDF: https://aclanthology.org/2024.inlg-tutorials.1.pdf