Julie Hunter


Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus
Laurent Prevot | Julie Hunter | Philippe Muller
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)

While discourse parsing has made considerable progress in recent years, discourse segmentation of conversational speech remains a difficult issue. In this paper, we exploit a French data set that has been manually segmented into discourse units to compare two approaches to discourse segmentation: fine-tuning existing systems on manual segmentation vs. using hand-crafted labelling rules to develop a weakly supervised segmenter. Our results show that both approaches yield similar performance in terms of F-score, while the data programming approach requires less manual annotation work. In a second experiment, we vary the amount of training data used for fine-tuning systems and show that a small amount of hand-labelled data is enough to obtain good results (although significantly lower than in the first experiment using all the annotated data available).

A simple but effective model for attachment in discourse parsing with multi-task learning for relation labeling
Zineb Bennis | Julie Hunter | Nicholas Asher
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

In this paper, we present a discourse parsing model for conversation trained on the STAC corpus. We fine-tune a BERT-based model to encode pairs of discourse units and use a simple linear layer to predict discourse attachments. We then exploit a multi-task setting to predict relation labels. The multi-task approach effectively aids in the difficult task of relation type prediction; our F1 score of 57 surpasses the state of the art with no loss in performance for attachment, confirming the intuitive interdependence of these two tasks. Our method also improves over previous discourse parsing models in allowing longer input sizes and in permitting attachments in which one node has multiple parents, an important feature of multiparty conversation.

Limits for learning with language models
Nicholas Asher | Swarnadeep Bhar | Akshay Chaturvedi | Julie Hunter | Soumya Paul
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

With the advent of large language models (LLMs), the trend in NLP has been to train LLMs on vast amounts of data to solve diverse language understanding and generation tasks. The list of LLM successes is long and varied. Nevertheless, several recent papers provide empirical evidence that LLMs fail to capture important aspects of linguistic meaning. Focusing on universal quantification, we provide a theoretical foundation for these empirical findings by proving that LLMs cannot learn certain fundamental semantic properties including semantic entailment and consistency as they are defined in formal semantics. More generally, we show that LLMs are unable to learn concepts beyond the first level of the Borel Hierarchy, which imposes severe limits on the ability of LMs, both large and small, to capture many aspects of linguistic meaning. This means that LLMs will operate without formal guarantees on tasks that require entailments and deep linguistic understanding.


Weakly supervised discourse segmentation for multiparty oral conversations
Lila Gravellier | Julie Hunter | Philippe Muller | Thomas Pellegrini | Isabelle Ferrané
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Discourse segmentation, the first step of discourse analysis, has been shown to improve results for text summarization, translation and other NLP tasks. While segmentation models for written text tend to perform well, they are not directly applicable to spontaneous, oral conversation, which has linguistic features foreign to written text. Segmentation is less studied for this type of language, where annotated data is scarce and existing corpora are more heterogeneous. We develop a weak supervision approach to adapt, using minimal annotation, a state-of-the-art discourse segmenter trained on written text to French conversation transcripts. Supervision is given by a latent model bootstrapped by manually defined heuristic rules that use linguistic and acoustic information. The resulting model improves on the original segmenter, especially in contexts where information on speaker turns is lacking or noisy, gaining up to 13% in F-score. Evaluation is performed on data like those used to define our heuristic rules, but also on transcripts from two other corpora.


Proceedings of the IWCS workshop on Foundations of Situated and Multimodal Communication
Nicholas Asher | Julie Hunter | Alex Lascarides
Proceedings of the IWCS workshop on Foundations of Situated and Multimodal Communication


Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus
Nicholas Asher | Julie Hunter | Mathieu Morey | Farah Benamara | Stergos Afantenos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper describes the STAC resource, a corpus of multi-party chats annotated for discourse structure in the style of SDRT (Asher and Lascarides, 2003; Lascarides and Asher, 2009). The main goal of the STAC project is to study the discourse structure of multi-party dialogues in order to understand the linguistic strategies adopted by interlocutors to achieve their conversational goals, especially when these goals are opposed. The STAC corpus is not only a rich source of data on strategic conversation, but also the first corpus that we are aware of that provides full discourse structures for multi-party dialogues. It has other remarkable features that make it an interesting resource for other topics: interleaved threads, creative language, and interactions between linguistic and extra-linguistic contexts.


Integrating Non-Linguistic Events into Discourse Structure
Julie Hunter | Nicholas Asher | Alex Lascarides
Proceedings of the 11th International Conference on Computational Semantics


Because We Say So
Julie Hunter | Laurence Danlos
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)