Gaetano Cimino


2024

Coherence-based Dialogue Discourse Structure Extraction using Open-Source Large Language Models
Gaetano Cimino | Chuyuan Li | Giuseppe Carenini | Vincenzo Deufemia
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Despite the challenges posed by data sparsity in discourse parsing for dialogues, unsupervised methods have been underexplored. Leveraging recent advances in Large Language Models (LLMs), in this paper we investigate an unsupervised coherence-based method to build discourse structures for multi-party dialogues using open-source LLMs fine-tuned on conversational data. Specifically, we propose two algorithms that extract dialogue structures by identifying their most coherent sub-dialogues: DS-DP employs a dynamic programming strategy, while DS-FLOW applies a greedy approach. Evaluation on the STAC corpus demonstrates a micro-F1 score of 58.1%, surpassing prior unsupervised methods. Furthermore, on a cleaned subset of the Molweni corpus, the proposed method achieves a micro-F1 score of 74.7%, highlighting its effectiveness across different corpora.
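For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F1 score over predicted discourse links can be computed. It is not taken from the paper: the function name `micro_f1` and the representation of a structure as a set of (head, dependent) utterance-index pairs are assumptions made for illustration, and the authors' evaluation script may differ in detail.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: a link is (head utterance index, dependent utterance index).
Link = Tuple[int, int]


def micro_f1(gold: Iterable[Set[Link]], pred: Iterable[Set[Link]]) -> float:
    """Micro-averaged F1: pool link counts over all dialogues, then compute P, R, F1."""
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)      # links predicted correctly in this dialogue
        n_pred += len(p)      # all predicted links
        n_gold += len(g)      # all gold links
    if tp == 0:
        return 0.0            # also covers empty predictions or empty gold
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = [{(0, 1), (1, 2)}, {(0, 1), (0, 2), (2, 3)}]
pred = [{(0, 1), (0, 2)}, {(0, 1), (1, 2), (2, 3)}]
score = micro_f1(gold, pred)
```

Because the counts are pooled before dividing, longer dialogues contribute more links and therefore weigh more heavily in the score than they would under a per-dialogue (macro) average.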