Are You a Good Assistant? Assessing LLM Trustability in Task-oriented Dialogues

Tiziano Labruna, Sofia Brenna, Giovanni Bonetta, Bernardo Magnini


Abstract
Despite the impressive ability of recent Large Language Models (LLMs) to generate human-like text, their capacity to produce contextually appropriate content for specific communicative situations is still a matter of debate. This issue is particularly crucial when LLMs are employed as assistants to help solve tasks or achieve goals within a given conversational domain. In such scenarios, the assistant is expected to access specific knowledge (e.g., a database of restaurants, a calendar of appointments) that is not directly accessible to the user and must use it consistently to accomplish the task. In this paper, we conduct experiments to evaluate the trustworthiness of automatic assistants in task-oriented dialogues. Our findings indicate that state-of-the-art open-source LLMs still face significant challenges in maintaining logical consistency with a knowledge base of facts, highlighting the need for further advancements in this area.
Anthology ID:
2024.clicit-1.56
Volume:
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Month:
December
Year:
2024
Address:
Pisa, Italy
Editors:
Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, Rachele Sprugnoli
Venue:
CLiC-it
Publisher:
CEUR Workshop Proceedings
Pages:
470–477
URL:
https://aclanthology.org/2024.clicit-1.56/
Cite (ACL):
Tiziano Labruna, Sofia Brenna, Giovanni Bonetta, and Bernardo Magnini. 2024. Are You a Good Assistant? Assessing LLM Trustability in Task-oriented Dialogues. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), pages 470–477, Pisa, Italy. CEUR Workshop Proceedings.
Cite (Informal):
Are You a Good Assistant? Assessing LLM Trustability in Task-oriented Dialogues (Labruna et al., CLiC-it 2024)
PDF:
https://aclanthology.org/2024.clicit-1.56.pdf