@article{prevot-muller-2025-shades,
title = "Few Shades of Supervision for Discourse Segmentation",
author = "Prevot, Laurent and
Muller, Philippe",
editor = "Zeldes, Amir and
Stede, Manfred and
Healey, Patrick G.T. and
Buschmeier, Hendrik",
journal = "Dialogue {\&} Discourse",
volume = "16",
month = dec,
year = "2025",
address = "Chicago, Illinois, USA",
publisher = "University of Illinois Chicago",
url = "https://aclanthology.org/2025.dnd-16.13/",
doi = "10.5210/dad.2025.202",
pages = "35--73",
abstract = "Elementary Discourse Units (EDUs) constitute the interface between language grammar and language use. On the one hand, they result from compositional semantic processes that combine individual word meanings into proposition-level representations. On the other hand, EDUs form the building blocks of most text, discourse, and dialogue frameworks. In written genres, where punctuation is available and reliable, segmenting EDUs is sometimes seen as a nearly solved problem, at least for high-resource languages. However, this is not the case for spontaneous speech transcripts. In this paper, we use a significant (8-hour) French corpus, manually segmented into EDUs, to evaluate several large language model (LLM)-based approaches for this task. We compare various fine-tuning strategies, including those relying on weakly supervised labels, in relation to the amount of ``gold'' manual annotations that can be available. We also experiment with in-context learning, where example instances are provided to condition a generative model (few-shot learning) or in a purely generative approach (zero-shot). Our findings indicate that classical fine-tuning is still the most effective approach, requiring only a reasonable amount of gold-annotated data to achieve the best performance in our experiments. Beyond traditional quantitative evaluation, we conducted a systematic qualitative analysis, identifying directions for further improvement. These include integrating prosodic considerations while handling pauses when they co-occur with disfluencies or complex discourse marker uses. Finally, we argue for the significance of this task and the resulting units, compared to acoustic and syntactic proxies, especially for quantitative linguistics focusing on spontaneous speech."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="prevot-muller-2025-shades">
<titleInfo>
<title>Few Shades of Supervision for Discourse Segmentation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Laurent</namePart>
<namePart type="family">Prevot</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Philippe</namePart>
<namePart type="family">Muller</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<genre authority="bibutilsgt">journal article</genre>
<relatedItem type="host">
<titleInfo>
<title>Dialogue & Discourse</title>
</titleInfo>
<originInfo>
<issuance>continuing</issuance>
<publisher>University of Illinois Chicago</publisher>
<place>
<placeTerm type="text">Chicago, Illinois, USA</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">periodical</genre>
<genre authority="bibutilsgt">academic journal</genre>
</relatedItem>
<abstract>Elementary Discourse Units (EDUs) constitute the interface between language grammar and language use. On the one hand, they result from compositional semantic processes that combine individual word meanings into proposition-level representations. On the other hand, EDUs form the building blocks of most text, discourse, and dialogue frameworks. In written genres, where punctuation is available and reliable, segmenting EDUs is sometimes seen as a nearly solved problem, at least for high-resource languages. However, this is not the case for spontaneous speech transcripts. In this paper, we use a significant (8-hour) French corpus, manually segmented into EDUs, to evaluate several large language model (LLM)-based approaches for this task. We compare various fine-tuning strategies, including those relying on weakly supervised labels, in relation to the amount of “gold” manual annotations that can be available. We also experiment with in-context learning, where example instances are provided to condition a generative model (few-shot learning) or in a purely generative approach (zero-shot). Our findings indicate that classical fine-tuning is still the most effective approach, requiring only a reasonable amount of gold-annotated data to achieve the best performance in our experiments. Beyond traditional quantitative evaluation, we conducted a systematic qualitative analysis, identifying directions for further improvement. These include integrating prosodic considerations while handling pauses when they co-occur with disfluencies or complex discourse marker uses. Finally, we argue for the significance of this task and the resulting units, compared to acoustic and syntactic proxies, especially for quantitative linguistics focusing on spontaneous speech.</abstract>
<identifier type="citekey">prevot-muller-2025-shades</identifier>
<identifier type="doi">10.5210/dad.2025.202</identifier>
<location>
<url>https://aclanthology.org/2025.dnd-16.13/</url>
</location>
<part>
<date>2025-12</date>
<detail type="volume"><number>16</number></detail>
<extent unit="page">
<start>35</start>
<end>73</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Journal Article
%T Few Shades of Supervision for Discourse Segmentation
%A Prevot, Laurent
%A Muller, Philippe
%J Dialogue & Discourse
%D 2025
%8 December
%V 16
%I University of Illinois Chicago
%C Chicago, Illinois, USA
%F prevot-muller-2025-shades
%X Elementary Discourse Units (EDUs) constitute the interface between language grammar and language use. On the one hand, they result from compositional semantic processes that combine individual word meanings into proposition-level representations. On the other hand, EDUs form the building blocks of most text, discourse, and dialogue frameworks. In written genres, where punctuation is available and reliable, segmenting EDUs is sometimes seen as a nearly solved problem, at least for high-resource languages. However, this is not the case for spontaneous speech transcripts. In this paper, we use a significant (8-hour) French corpus, manually segmented into EDUs, to evaluate several large language model (LLM)-based approaches for this task. We compare various fine-tuning strategies, including those relying on weakly supervised labels, in relation to the amount of “gold” manual annotations that can be available. We also experiment with in-context learning, where example instances are provided to condition a generative model (few-shot learning) or in a purely generative approach (zero-shot). Our findings indicate that classical fine-tuning is still the most effective approach, requiring only a reasonable amount of gold-annotated data to achieve the best performance in our experiments. Beyond traditional quantitative evaluation, we conducted a systematic qualitative analysis, identifying directions for further improvement. These include integrating prosodic considerations while handling pauses when they co-occur with disfluencies or complex discourse marker uses. Finally, we argue for the significance of this task and the resulting units, compared to acoustic and syntactic proxies, especially for quantitative linguistics focusing on spontaneous speech.
%R 10.5210/dad.2025.202
%U https://aclanthology.org/2025.dnd-16.13/
%U https://doi.org/10.5210/dad.2025.202
%P 35-73
Markdown (Informal)
[Few Shades of Supervision for Discourse Segmentation](https://aclanthology.org/2025.dnd-16.13/) (Prevot & Muller, DND 2025)