A proof of concept on triangular test evaluation for Natural Language Generation

Javier González Corbelle, José María Alonso Moral, Alberto Bugarín Diz


Abstract
The evaluation of Natural Language Generation (NLG) systems has recently aroused much interest in the research community, since it should address several challenging aspects, such as readability of the generated texts, adequacy to the user within a particular context and moment, and linguistic quality-related issues (e.g., correctness, coherence, understandability), among others. In this paper, we propose a novel technique for evaluating NLG systems that is inspired by the triangular test used in the field of sensory analysis. This technique allows us to compare two texts generated by different subjects and to i) determine whether humans detect statistically significant differences between them and ii) quantify to what extent the number of evaluators affects the sensitivity of the results. As a proof of concept, we apply this evaluation technique to a real use case in the field of meteorology, showing the advantages and disadvantages of our proposal.
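The significance check mentioned in the abstract can be illustrated with the classical triangle test from sensory analysis, in which each evaluator sees three samples (two from one source, one from the other) and must pick the odd one out, so a pure guess is correct with probability 1/3. The sketch below is an assumption about how such a test is usually scored (a one-sided binomial test), not the authors' implementation; the function names `triangle_test_p_value` and `min_correct_for_significance` are hypothetical.

```python
from math import comb


def triangle_test_p_value(correct: int, evaluators: int) -> float:
    """One-sided binomial p-value for a triangle test.

    Assumes each evaluator guesses the odd sample correctly with
    probability 1/3 when the two texts are indistinguishable.
    """
    if not 0 <= correct <= evaluators:
        raise ValueError("correct must lie between 0 and evaluators")
    p_chance = 1 / 3
    # Probability of observing at least `correct` right answers by chance.
    return sum(
        comb(evaluators, k) * p_chance**k * (1 - p_chance) ** (evaluators - k)
        for k in range(correct, evaluators + 1)
    )


def min_correct_for_significance(evaluators: int, alpha: float = 0.05):
    """Smallest number of correct answers that reaches significance.

    Returns None when even a unanimous panel cannot reach `alpha`,
    which shows how the panel size limits the test's sensitivity.
    """
    for correct in range(evaluators + 1):
        if triangle_test_p_value(correct, evaluators) <= alpha:
            return correct
    return None
```

Under these assumptions, a panel of 12 evaluators needs at least 8 correct identifications to reach the 0.05 level (p ≈ 0.019 for 8 correct, p ≈ 0.066 for 7), while a single evaluator can never reach it.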
Anthology ID:
2020.evalnlgeval-1.1
Volume:
Proceedings of the 1st Workshop on Evaluating NLG Evaluation
Month:
December
Year:
2020
Address:
Online (Dublin, Ireland)
Editors:
Shubham Agarwal, Ondřej Dušek, Sebastian Gehrmann, Dimitra Gkatzia, Ioannis Konstas, Emiel Van Miltenburg, Sashank Santhanam
Venue:
EvalNLGEval
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
1–9
URL:
https://aclanthology.org/2020.evalnlgeval-1.1
Cite (ACL):
Javier González Corbelle, José María Alonso Moral, and Alberto Bugarín Diz. 2020. A proof of concept on triangular test evaluation for Natural Language Generation. In Proceedings of the 1st Workshop on Evaluating NLG Evaluation, pages 1–9, Online (Dublin, Ireland). Association for Computational Linguistics.
Cite (Informal):
A proof of concept on triangular test evaluation for Natural Language Generation (Corbelle et al., EvalNLGEval 2020)
PDF:
https://aclanthology.org/2020.evalnlgeval-1.1.pdf