Dial-M: A Masking-based Framework for Dialogue Evaluation

Suvodip Dey, Maunendra Sankar Desarkar


Abstract
In dialogue systems, automatically evaluating machine-generated responses is critical and challenging. Despite the tremendous progress in dialogue generation research, its evaluation still depends heavily on human judgments. The standard word-overlap-based evaluation metrics are ineffective for dialogues. As a result, most of the recently proposed metrics are model-based and reference-free, learning to score different aspects of a conversation. However, understanding each aspect requires a separate model, which makes these metrics computationally expensive. To this end, we propose Dial-M, a Masking-based reference-free framework for Dialogue evaluation. The main idea is to mask the keywords of the current utterance and predict them, given the dialogue history and various conditions (like knowledge, persona, etc.), thereby making the evaluation framework simple and easily extensible to multiple datasets. Despite its simplicity, Dial-M achieves performance comparable to state-of-the-art metrics on several dialogue evaluation datasets. We also discuss the interpretability of our proposed metric along with error analysis.
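The masking idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the stopword-based keyword selection, the one-keyword-at-a-time masking, the averaged negative log-likelihood score, and the names `mask_keywords`, `masking_score`, and `toy_predictor` are all hypothetical. A real system would replace `toy_predictor` with a masked language model conditioned on the dialogue history and any extra conditions.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical stopword list: keyword selection by stopword filtering is an
# assumption of this sketch, not something the abstract specifies.
STOPWORDS = {"the", "a", "an", "is", "to", "i", "it", "of", "and", "in", "you"}
MASK = "[MASK]"

# A predictor maps (context, masked tokens, masked position, original token)
# to the probability assigned to the original token at that position.
Predictor = Callable[[str, List[str], int, str], float]


def mask_keywords(utterance: str) -> List[Tuple[List[str], int, str]]:
    """Return one (masked_tokens, position, original_token) triple per keyword."""
    tokens = utterance.lower().split()
    masked_variants = []
    for i, tok in enumerate(tokens):
        if tok not in STOPWORDS:
            masked = tokens[:i] + [MASK] + tokens[i + 1:]
            masked_variants.append((masked, i, tok))
    return masked_variants


def masking_score(context: str, utterance: str, predict_prob: Predictor) -> float:
    """Average negative log-likelihood of the masked keywords (lower is better)."""
    variants = mask_keywords(utterance)
    if not variants:
        return 0.0  # no keywords to predict
    losses = [
        -math.log(max(predict_prob(context, masked, pos, tok), 1e-12))
        for masked, pos, tok in variants
    ]
    return sum(losses) / len(losses)


def toy_predictor(context: str, masked_tokens: List[str], position: int, target: str) -> float:
    """Toy stand-in for a masked language model (assumption for illustration):
    a keyword is 'predictable' if it already appears in the dialogue context."""
    return 0.9 if target in context.lower().split() else 0.1


history = "do you like jazz music"
on_topic = masking_score(history, "i like jazz", toy_predictor)
off_topic = masking_score(history, "i eat pizza", toy_predictor)
```

Under this toy predictor, the on-topic response receives a lower score than the off-topic one, because its keywords are easier to recover from the dialogue history. Conditions such as knowledge or persona could be supplied by concatenating them into `context`, which is again an assumption of this sketch.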
Anthology ID:
2023.sigdial-1.7
Volume:
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2023
Address:
Prague, Czechia
Editors:
Svetlana Stoyanchev, Shafiq Joty, David Schlangen, Ondrej Dusek, Casey Kennington, Malihe Alikhani
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
77–84
URL:
https://aclanthology.org/2023.sigdial-1.7
DOI:
10.18653/v1/2023.sigdial-1.7
Cite (ACL):
Suvodip Dey and Maunendra Sankar Desarkar. 2023. Dial-M: A Masking-based Framework for Dialogue Evaluation. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 77–84, Prague, Czechia. Association for Computational Linguistics.
Cite (Informal):
Dial-M: A Masking-based Framework for Dialogue Evaluation (Dey & Desarkar, SIGDIAL 2023)
PDF:
https://aclanthology.org/2023.sigdial-1.7.pdf