Semantic Diversity for Natural Language Understanding Evaluation in Dialog Systems

Enrico Palumbo, Andrea Mezzalira, Cristina Marco, Alessandro Manzotti, Daniele Amberti


Abstract
The quality of Natural Language Understanding (NLU) models is typically evaluated using aggregated metrics on a large number of utterances. In a dialog system, though, the manual analysis of failures on specific utterances is time-consuming yet critical to guaranteeing a high-quality customer experience. A crucial question for this analysis is how to create a test set of utterances that covers a diversity of possible customer requests. In this paper, we introduce the task of generating a test set with high semantic diversity for NLU evaluation in dialog systems and describe an approach to address it. The approach starts by extracting high-traffic utterance patterns. Then, for each pattern, it achieves high diversity by selecting utterances from different regions of the utterance embedding space. We compare three selection strategies, based on clustering utterances in the embedding space, on solving the maximum distance optimization problem, and on simple heuristics such as uniform random sampling and popularity. The evaluation shows that the highest semantic and lexical diversity is obtained by a greedy maximum sum of distance solver, with a runtime comparable to that of the clustering and heuristic approaches.
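As a rough illustration of the greedy maximum sum of distance idea mentioned in the abstract, the following minimal sketch picks k items from a set of embedding vectors so that each newly added item maximizes its summed distance to the items already chosen. It is not the authors' implementation: the function names, the use of Euclidean distance, and the choice of seed point are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_max_sum_selection(embeddings, k):
    """Return indices of k embeddings chosen greedily for high summed distance.

    Hypothetical helper: the seed is the point with the largest total
    distance to all other points (an assumption), and each later pick
    maximizes the sum of distances to the points selected so far.
    """
    n = len(embeddings)
    if k <= 0:
        return []
    if k >= n:
        return list(range(n))

    # Seed with the point that is farthest, in total, from everything else.
    totals = [
        sum(euclidean(embeddings[i], embeddings[j]) for j in range(n))
        for i in range(n)
    ]
    first = max(range(n), key=lambda i: totals[i])
    selected = [first]

    # gain[i] holds the sum of distances from point i to the selected set.
    gain = [euclidean(embeddings[i], embeddings[first]) for i in range(n)]

    while len(selected) < k:
        candidates = [i for i in range(n) if i not in selected]
        best = max(candidates, key=lambda i: gain[i])
        selected.append(best)
        # Update gains incrementally instead of recomputing all sums.
        for i in range(n):
            gain[i] += euclidean(embeddings[i], embeddings[best])

    return selected


# Toy 2-D "embeddings": three near-duplicates close to the origin and two outliers.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 0.0), (0.0, 10.0)]
picked = greedy_max_sum_selection(toy_embeddings, 3)
print(sorted(picked))
```

In this toy example the two outliers are selected first, followed by one of the points near the origin, so near-duplicate points are not picked together. In practice the vectors would come from an utterance embedding model, and the selection would be run separately for each utterance pattern.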
Anthology ID: 2020.coling-industry.5
Volume: Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
Month: December
Year: 2020
Address: Online
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 44–49
URL: https://aclanthology.org/2020.coling-industry.5
DOI: 10.18653/v1/2020.coling-industry.5
Cite (ACL): Enrico Palumbo, Andrea Mezzalira, Cristina Marco, Alessandro Manzotti, and Daniele Amberti. 2020. Semantic Diversity for Natural Language Understanding Evaluation in Dialog Systems. In Proceedings of the 28th International Conference on Computational Linguistics: Industry Track, pages 44–49, Online. International Committee on Computational Linguistics.
Cite (Informal): Semantic Diversity for Natural Language Understanding Evaluation in Dialog Systems (Palumbo et al., COLING 2020)
PDF: https://aclanthology.org/2020.coling-industry.5.pdf