Philippe Muller - ACL Anthology

Philippe Muller

2026

Minimal Clips, Maximum Salience: Long Video Summarization via Key Moment Extraction
Galann Pennec | Zhengyuan Liu | Nicholas Asher | Philippe Muller | Nancy Chen
Proceedings of the 16th International Workshop on Spoken Dialogue System Technology

Vision-Language Models (VLMs) are able to process increasingly long videos. Yet, important visual information is easily lost across the full context and missed by VLMs. It is also important to design tools that enable cost-effective analysis of lengthy video content. In this paper, we propose a clip selection method that targets key video moments to be included in a multimodal summary. We divide the video into short clips and generate compact visual descriptions of each using a lightweight video captioning model. These are then passed to a large language model (LLM), which selects the K clips containing the most relevant visual information for a multimodal summary. We evaluate our approach on reference clips for the task, automatically derived from full human-annotated screenplays and summaries in the MovieSum dataset. We further show that these reference clips (less than 6% of the movie) are sufficient to build a complete multimodal summary of the movies in MovieSum. Using our clip selection method, we achieve a summarization performance close to that of these reference clips while capturing substantially more relevant video information than random clip selection. Importantly, we maintain low computational cost by relying on a lightweight captioning model.
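
The pipeline described in this abstract (split into clips, caption each clip, select K clips) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `caption_fn` and `score_fn` are hypothetical placeholders for the lightweight captioning model and the LLM, and scoring each caption independently is a simplification of an LLM choosing the K clips jointly.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Clip:
    start: float        # clip start time, in seconds
    end: float          # clip end time, in seconds
    caption: str = ""   # filled in by the captioning step


def split_into_clips(duration: float, clip_len: float) -> List[Clip]:
    """Cut [0, duration] into consecutive clips of at most clip_len seconds."""
    clips: List[Clip] = []
    start = 0.0
    while start < duration:
        clips.append(Clip(start=start, end=min(start + clip_len, duration)))
        start += clip_len
    return clips


def select_key_clips(
    clips: List[Clip],
    caption_fn: Callable[[Clip], str],  # hypothetical stand-in for the captioning model
    score_fn: Callable[[str], float],   # hypothetical stand-in for the LLM's judgment
    k: int,
) -> List[Clip]:
    """Caption every clip, keep the k highest-scoring ones, return them in time order."""
    for clip in clips:
        clip.caption = caption_fn(clip)
    ranked = sorted(clips, key=lambda c: score_fn(c.caption), reverse=True)
    return sorted(ranked[:k], key=lambda c: c.start)
```

Returning the selected clips in chronological order keeps them usable as input to a downstream summarizer that expects the narrative order of the video.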

2025

Supervision faible pour la classification des relations discursives (Weak Supervision for Discourse Relation Classification)
Khalil Maachou | Chloé Braud | Philippe Muller
Actes des 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : articles scientifiques originaux

Identifying discourse relations is important for understanding the semantic links that structure a text, but the task suffers from a lack of data that limits performance. On the other hand, many discourse corpora exist; however, divergences between annotation projects prevent these datasets from being combined directly for training. We propose to address this problem by exploiting the weak supervision framework, whose goal is to generate annotations from varied sources, such as heuristics or pre-trained models. These noisy and partial annotations are then combined to train a model on the task. By combining this method with strategies for handling differences in label sets, we demonstrate that it is possible to obtain performance close to that of a fully supervised system while relying on a very small portion of the original data, thus opening up prospects for improvement for low-resource domains or languages.

Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and Evaluation
Galann Pennec | Zhengyuan Liu | Nicholas Asher | Philippe Muller | Nancy F. Chen
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

Vision-Language Models (VLMs) often struggle to balance visual and textual information when summarizing complex multimodal inputs, such as entire TV show episodes. In this paper, we propose a zero-shot video-to-text summarization approach that builds its own screenplay-like representation of an episode, effectively integrating key video moments, dialogue, and character information into a unified document. Unlike previous approaches, we simultaneously generate screenplays and name the characters in zero-shot, using only the audio, video, and transcripts as input. Additionally, we highlight that existing summarization metrics can fail to assess the multimodal content in summaries. To address this, we introduce MFactSum, a multimodal metric that evaluates summaries with respect to both vision and text modalities. Using MFactSum, we evaluate our screenplay summaries on the SummScreen3D dataset, demonstrating superiority against state-of-the-art VLMs such as Gemini 1.5 by generating summaries containing 20% more relevant visual information while requiring 75% less of the video as input.

Few Shades of Supervision for Discourse Segmentation
Laurent Prevot | Philippe Muller
Dialogue & Discourse Volume 16

Elementary Discourse Units (EDUs) constitute the interface between language grammar and language use. On the one hand, they result from compositional semantic processes that combine individual word meanings into proposition-level representations. On the other hand, EDUs form the building blocks of most text, discourse, and dialogue frameworks. In written genres, where punctuation is available and reliable, segmenting EDUs is sometimes seen as a nearly solved problem, at least for high-resource languages. However, this is not the case for spontaneous speech transcripts. In this paper, we use a significant (8-hour) French corpus, manually segmented into EDUs, to evaluate several large language model (LLM)-based approaches for this task. We compare various fine-tuning strategies, including those relying on weakly supervised labels, in relation to the amount of “gold” manual annotations that can be available. We also experiment with in-context learning, where example instances are provided to condition a generative model (few-shot learning), or in a purely generative approach (zero-shot). Our findings indicate that classical fine-tuning is still the most effective approach, requiring only a reasonable amount of gold-annotated data to achieve the best performance in our experiments. Beyond traditional quantitative evaluation, we conducted a systematic qualitative analysis, identifying directions for further improvement. These include integrating prosodic considerations while handling pauses when they co-occur with disfluencies or complex uses of discourse markers. Finally, we argue for the significance of this task and the resulting units, compared to acoustic and syntactic proxies, especially for quantitative linguistics focusing on spontaneous speech.

DisCuT and DiscReT: MELODI at DISRPT 2025 Multilingual discourse segmentation, connective tagging and relation classification
Robin Pujol | Firmin Rousseau | Philippe Muller | Chloé Braud
Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025)

This paper presents the results obtained by the MELODI team for the three tasks proposed within the DISRPT 2025 shared task on discourse: segmentation, connective identification, and relation classification. The competition involves corpora in various languages, in several underlying frameworks, and datasets are given with or without sentence segmentation. This year, for the ranked, closed track, the campaign adds as a constraint to train only one model for each task, with an upper bound on the size of the model (no more than 4B parameters). An additional open track authorizes models of any size, possibly non-public, that will not be reproduced by the organizers and thus not ranked. We compared several fine-tuning approaches based either on encoder-only transformer models or on auto-regressive generative ones. To be able to train one model on the variety of corpora, we explored various ways of combining data (by framework, language or language groups, with different sequential orderings), and the addition of features to guide the model. For the closed track, our final submitted system is based on XLM-RoBERTa large for relation identification, and on InfoXLM for segmentation and connective identification. Our experiments demonstrate that building a single, multilingual model does not necessarily degrade the performance compared to language-specific systems, with at best 64.06% for relation identification, 90.19% for segmentation and 81.15% for connective identification (on average on the development sets), results that are similar to or higher than those obtained in previous campaigns. We also found that a generative approach could give even higher results on relation identification, with at best 64.65% on the dev sets.

The DISRPT 2025 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification
Chloé Braud | Amir Zeldes | Chuyuan Li | Yang Janet Liu | Philippe Muller
Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025)

In 2025, we held the fourth iteration of the DISRPT Shared Task (Discourse Relation Parsing and Treebanking) dedicated to discourse parsing across formalisms. Following the success of the 2019, 2021, and 2023 tasks on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification, this iteration added 13 new datasets, including three new languages (Czech, Polish, Nigerian Pidgin) and two new frameworks: the ISO framework and Enhanced Rhetorical Structure Theory, in addition to the previously included frameworks: RST, SDRT, DEP, and PDTB. In this paper, we review the data included in DISRPT 2025, which covers 39 datasets across 16 languages, survey and compare submitted systems, and report on system performance on each task for both treebanked and plain-tokenized versions of the data. The best systems obtain a mean accuracy of 71.19% for relation classification, a mean F1 of 91.57 (Treebanked Track) and 87.38 (Plain Track) for segmentation, and a mean F1 of 81.53 (Treebanked Track) and 79.92 (Plain Track) for connective identification. The data and trained models of several participants can be found at https://huggingface.co/multilingual-discourse-hub.

Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025)
Chloé Braud | Yang Janet Liu | Philippe Muller | Amir Zeldes | Chuyuan Li
Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025)

2024

DISRPT: A Multilingual, Multi-domain, Cross-framework Benchmark for Discourse Processing
Chloé Braud | Amir Zeldes | Laura Rivière | Yang Janet Liu | Philippe Muller | Damien Sileo | Tatsuya Aoyama
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This paper presents DISRPT, a multilingual, multi-domain, and cross-framework benchmark dataset for discourse processing, covering the tasks of discourse unit segmentation, connective identification, and relation classification. DISRPT includes 13 languages, with data from 24 corpora covering about 4 million tokens and around 250,000 discourse relation instances from 4 discourse frameworks: RST, SDRT, PDTB, and Discourse Dependencies. We present an overview of the data, its development across three NLP shared tasks on discourse processing carried out in the past five years, and the latest modifications and added extensions. We also carry out an evaluation of state-of-the-art multilingual systems trained on the data for each task, showing plateau performance on segmentation, but important room for improvement for connective identification and relation classification. The DISRPT benchmark employs a unified format that we make available on GitHub and HuggingFace in order to encourage future work on discourse processing across languages, domains, and frameworks.

Zero-shot Learning for Multilingual Discourse Relation Classification
Eleni Metheniti | Philippe Muller | Chloé Braud | Margarita Hernández Casas
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Classifying discourse relations is known to be a hard task, relying on complex cues. On the other hand, discourse-annotated data is scarce, especially for languages other than English: many corpora, of limited size, exist for several languages, but the domain is split between different theoretical frameworks that have a huge impact on the nature of the textual spans to be linked and on the label set used. Moreover, each annotation project implements modifications compared to the theoretical background and other projects. These discrepancies hinder the development of systems taking advantage of all the available data to tackle data sparsity, and work on transfer between languages is very limited, and almost nonexistent between frameworks, although it could improve our understanding of some theoretical aspects and enhance many applications. In this paper, we propose the first experiments on zero-shot learning for discourse relation classification and investigate several paths in the way source data can be combined, either based on languages, frameworks, or similarity measures. We demonstrate how difficult transfer is for the task at hand, and that the most impactful factor is label set divergence, where the notion of underlying framework possibly conceals crucial disagreements.

In2Core: Leveraging Influence Functions for Coreset Selection in Instruction Finetuning of Large Language Models
Ayrton San Joaquin | Bin Wang | Zhengyuan Liu | Nicholas Asher | Brian Lim | Philippe Muller | Nancy F. Chen
Findings of the Association for Computational Linguistics: EMNLP 2024

Despite advancements, fine-tuning Large Language Models (LLMs) remains costly due to the extensive parameter count and substantial data requirements for model generalization. Accessibility to computing resources remains a barrier for the open-source community. To address this challenge, we propose the In2Core algorithm, which selects a coreset by analyzing the correlation between training and evaluation samples with a trained model. Notably, we assess the model’s internal gradients to estimate this relationship, aiming to rank the contribution of each training point. To enhance efficiency, we propose an optimization to compute influence functions with a reduced number of layers while achieving similar accuracy. By applying our algorithm to instruction fine-tuning data of LLMs, we can achieve similar performance with just 50% of the training data. Meanwhile, using influence functions to analyze model coverage of certain test samples could provide a reliable and interpretable signal on the training set’s coverage of those test points.

Feature-augmented model for multilingual discourse relation classification
Eleni Metheniti | Chloé Braud | Philippe Muller
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024)

Discourse relation classification within a multilingual, cross-framework setting is a challenging task, and the best-performing systems so far have relied on monolingual and mono-framework approaches. In this paper, we introduce transformer-based multilingual models, trained jointly over all datasets, thus covering different languages and discourse frameworks. We demonstrate their ability to outperform single-corpus models and to overcome (to some extent) the disparity among corpora, by relying on linguistic features and generic information about the nature of the datasets. We also compare the performance of different multilingual pretrained models, as well as the encoding of the relation direction, a key component for the task. Our results on the 16 datasets of the DISRPT 2021 benchmark show improvements in accuracy in (almost) all datasets compared to the monolingual models, with at best 65.91% in average accuracy, thus corresponding to a 4% improvement over the state-of-the-art.

Complex question generation using discourse-based data augmentation
Khushnur Jahangir | Philippe Muller | Chloé Braud
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024)

Question Generation (QG), the process of generating meaningful questions from a given context, has proven to be useful for several tasks such as question answering or FAQ generation. While most existing QG techniques generate simple, fact-based questions, this research aims to generate questions that can have complex answers (e.g. “why” questions). We propose a data augmentation method that uses discourse relations to create such questions, and experiment on existing English data. Our approach generates questions based solely on the context without answer supervision, in order to enhance question diversity and complexity. We use an encoder-decoder trained on the augmented dataset to generate either one question or multiple questions at a time, and show that the latter improves over the baseline model when doing a human quality evaluation, without degrading performance according to standard automated metrics.

2023

MELODI at SemEval-2023 Task 3: In-domain Pre-training for Low-resource Classification of News Articles
Nicolas Devatine | Philippe Muller | Chloé Braud
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our approach to Subtask 1 “News Genre Categorization” of SemEval-2023 Task 3 “Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup”, which aims to determine whether a given news article is an opinion piece, an objective report, or satirical. We fine-tuned the domain-specific language model POLITICS, which was pre-trained on a large-scale dataset of more than 3.6M English political news articles following ideology-driven pre-training objectives. In order to use it in the multilingual setup of the task, we added as a pre-processing step the translation of all documents into English. Our system ranked among the top systems overall in most languages, and ranked 1st on the English dataset.

Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus
Laurent Prevot | Julie Hunter | Philippe Muller
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)

While discourse parsing has made considerable progress in recent years, discourse segmentation of conversational speech remains a difficult issue. In this paper, we exploit a French data set that has been manually segmented into discourse units to compare two approaches to discourse segmentation: fine-tuning existing systems on manual segmentation vs. using hand-crafted labelling rules to develop a weakly supervised segmenter. Our results show that both approaches yield similar performance in terms of f-score while data programming requires less manual annotation work. In a second experiment we play with the amount of training data used for fine-tuning systems and show that a small amount of hand labelled data is enough to obtain good results (although significantly lower than in the first experiment using all the annotated data available).

An Integrated Approach for Political Bias Prediction and Explanation Based on Discursive Structure
Nicolas Devatine | Philippe Muller | Chloé Braud
Findings of the Association for Computational Linguistics: ACL 2023

One crucial aspect of democracy is fair information sharing. While it is hard to prevent biases in news, they should be identified for better transparency. We propose an approach to automatically characterize biases that takes into account structural differences and that is efficient for long texts. This yields new ways to provide explanations for a textual classifier, going beyond mere lexical cues. We show that: (i) the use of discourse-based structure-aware document representations compares well to local, computationally heavy, or domain-specific models on classification tasks that deal with textual bias; (ii) our approach based on different levels of granularity allows for the generation of better explanations of model decisions, both at the lexical and structural level, while addressing the challenge posed by long texts.

DisCut and DiscReT: MELODI at DISRPT 2023
Eleni Metheniti | Chloé Braud | Philippe Muller | Laura Rivière
Proceedings of the 3rd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2023)

This paper presents the results obtained by the MELODI team for the three tasks proposed within the DISRPT 2023 shared task on discourse: segmentation, connective identification, and relation classification. The competition involves corpora in various languages in several underlying frameworks, and proposes two tracks depending on the presence or not of annotations of sentence boundaries and syntactic information. For these three tasks, we rely on a transformer-based architecture, and investigate several optimizations of the models, including hyper-parameter search and layer freezing. For discourse relations, we also explore the use of adapters (a lightweight solution for model fine-tuning) and introduce relation mappings to partially deal with the label set explosion we are facing within the setting of the shared task in a multi-corpus perspective. In the end, we propose one single architecture for segmentation and connectives, based on XLM-RoBERTa large, frozen at the lower layers, with new state-of-the-art results for segmentation, and we propose 3 different models for relations, since the task makes it harder to generalize across all corpora.

The DISRPT 2023 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification
Chloé Braud | Yang Janet Liu | Eleni Metheniti | Philippe Muller | Laura Rivière | Attapol Rutherford | Amir Zeldes
Proceedings of the 3rd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2023)

In 2023, the third iteration of the DISRPT Shared Task (Discourse Relation Parsing and Treebanking) was held, dedicated to the underlying units used in discourse parsing across formalisms. Following the success of the 2019 and 2021 tasks on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification, this iteration has added 10 new corpora, including 2 new languages (Thai and Italian) and 3 discourse treebanks annotated in the discourse dependency representation in addition to the previously included frameworks: RST, SDRT, and PDTB. In this paper, we review the data included in the Shared Task, which covers 26 datasets across 13 languages, survey and compare submitted systems, and report on system performance on each task for both annotated and plain-tokenized versions of the data.

Proceedings of the 3rd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2023)
Chloé Braud | Yang Janet Liu | Eleni Metheniti | Philippe Muller | Laura Rivière | Attapol Rutherford | Amir Zeldes
Proceedings of the 3rd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2023)

2022

A Pragmatics-Centered Evaluation Framework for Natural Language Understanding
Damien Sileo | Philippe Muller | Tim Van de Cruys | Camille Pradel
Proceedings of the Thirteenth Language Resources and Evaluation Conference

New models for natural language understanding have recently made an unparalleled amount of progress, which has led some researchers to suggest that the models induce universal text representations. However, current benchmarks are predominantly targeting semantic phenomena; we make the case that pragmatics needs to take center stage in the evaluation of natural language understanding. We introduce PragmEval, a new benchmark for the evaluation of natural language understanding, that unites 11 pragmatics-focused evaluation datasets for English. PragmEval can be used as supplementary training data in a multi-task learning setup, and is publicly available, alongside the code for gathering and preprocessing the datasets. Using our evaluation suite, we show that natural language inference, a widely used pretraining task, does not result in genuinely universal representations, which presents a new challenge for multi-task learning.

Predicting Political Orientation in News with Latent Discourse Structure to Improve Bias Understanding
Nicolas Devatine | Philippe Muller | Chloé Braud
Proceedings of the 3rd Workshop on Computational Approaches to Discourse

With the growing number of information sources, the problem of media bias becomes worrying for a democratic society. This paper explores the task of predicting the political orientation of news articles, with a goal of analyzing how bias is expressed. We demonstrate that integrating rhetorical dimensions via latent structures over sub-sentential discourse units allows for large improvements, with a +7.4 points difference between the base LSTM model and its discourse-based version, and +3 points improvement over the previous BERT-based state-of-the-art model. We also argue that this gives a new relevant handle for analyzing political bias in news articles.

2021

Plongements Interprétables pour la Détection de Biais Cachés (Interpretable Embeddings for Hidden Biases Detection)
Tom Bourgeade | Philippe Muller | Tim Van de Cruys
Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale

Many semantic tasks in NLP make use of semi-automatically collected data, which is often a source of undesirable artifacts that can negatively affect the models trained on them. With the more recent shift toward more complex, less interpretable, general-purpose pre-trained models, these biases can lead to undesirable correlations being integrated into user-facing applications. Recently, a few methods have been proposed to train word embeddings with better interpretability. We propose a simple method that exploits these representations to preemptively detect easy-to-learn lexical correlations in various datasets. To this end, we evaluate a few popular interpretable embedding models for English, using both an intrinsic evaluation and a set of downstream semantic tasks, and we use the interpretable quality of the embeddings to diagnose potential biases in the associated datasets.

Weakly supervised discourse segmentation for multiparty oral conversations
Lila Gravellier | Julie Hunter | Philippe Muller | Thomas Pellegrini | Isabelle Ferrané
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Discourse segmentation, the first step of discourse analysis, has been shown to improve results for text summarization, translation and other NLP tasks. While segmentation models for written text tend to perform well, they are not directly applicable to spontaneous, oral conversation, which has linguistic features foreign to written text. Segmentation is less studied for this type of language, where annotated data is scarce, and existing corpora more heterogeneous. We develop a weak supervision approach to adapt, using minimal annotation, a state of the art discourse segmenter trained on written text to French conversation transcripts. Supervision is given by a latent model bootstrapped by manually defined heuristic rules that use linguistic and acoustic information. The resulting model improves the original segmenter, especially in contexts where information on speaker turns is lacking or noisy, gaining up to 13% in F-score. Evaluation is performed on data like those used to define our heuristic rules, but also on transcripts from two other corpora.
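
To make the idea of heuristic rules as a supervision source concrete, here is a minimal sketch of weak labeling for boundary detection. Everything in it is an assumption made for illustration: the `Token` fields, the thresholds, the marker list, and the rules themselves are hypothetical, and a simple majority vote stands in for the latent model that the abstract describes.

```python
from dataclasses import dataclass
from typing import Callable, List

ABSTAIN, NO_BOUNDARY, BOUNDARY = -1, 0, 1


@dataclass
class Token:
    text: str
    pause_before: float  # silence before the token, in seconds (acoustic cue)
    new_speaker: bool    # True if the speaker changes at this token


# Hypothetical labeling rules: each one votes on whether a unit starts here.
def lf_long_pause(tok: Token) -> int:
    return BOUNDARY if tok.pause_before > 0.5 else ABSTAIN


def lf_discourse_marker(tok: Token) -> int:
    return BOUNDARY if tok.text.lower() in {"mais", "donc", "alors"} else ABSTAIN


def lf_speaker_change(tok: Token) -> int:
    return BOUNDARY if tok.new_speaker else ABSTAIN


def lf_no_pause(tok: Token) -> int:
    quick = tok.pause_before < 0.05 and not tok.new_speaker
    return NO_BOUNDARY if quick else ABSTAIN


RULES: List[Callable[[Token], int]] = [
    lf_long_pause, lf_discourse_marker, lf_speaker_change, lf_no_pause,
]


def weak_label(tok: Token) -> int:
    """Majority vote over non-abstaining rules; ties and silence give ABSTAIN."""
    votes = [v for v in (rule(tok) for rule in RULES) if v != ABSTAIN]
    pos = sum(1 for v in votes if v == BOUNDARY)
    neg = len(votes) - pos
    if pos == neg:
        return ABSTAIN
    return BOUNDARY if pos > neg else NO_BOUNDARY
```

Tokens that receive a non-abstaining weak label can then serve as noisy training data for a segmenter, which is the general shape of the adaptation step the abstract outlines.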

Multi-lingual Discourse Segmentation and Connective Identification: MELODI at Disrpt2021
Morteza Kamaladdini Ezzabady | Philippe Muller | Chloé Braud
Proceedings of the 2nd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2021)

We present an approach for discourse segmentation and discourse connective identification, both at the sentence and document level, within the Disrpt 2021 shared task, a multi-lingual and multi-formalism evaluation campaign. Building on the most successful architecture from the 2019 similar shared task, we leverage datasets in the same or similar languages to augment training data and improve on the best systems from the previous campaign on 3 out of 4 subtasks, with a mean improvement on all 16 datasets of 0.85%. Within the Disrpt 21 campaign the system ranks 3rd overall, very close to the 2nd system, but with a significant gap with respect to the best system, which uses a rich set of additional features. The system is nonetheless the best on languages that benefited from crosslingual training on sentence internal segmentation (German and Spanish).

The DISRPT 2021 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification
Amir Zeldes | Yang Janet Liu | Mikel Iruskieta | Philippe Muller | Chloé Braud | Sonia Badene
Proceedings of the 2nd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2021)

In 2021, we organized the second iteration of a shared task dedicated to the underlying units used in discourse parsing across formalisms: the DISRPT Shared Task (Discourse Relation Parsing and Treebanking). Adding to the 2019 tasks on Elementary Discourse Unit Segmentation and Connective Detection, this iteration of the Shared Task included for the first time a track on discourse relation classification across three formalisms: RST, SDRT, and PDTB. In this paper we review the data included in the Shared Task, which covers nearly 3 million manually annotated tokens from 16 datasets in 11 languages, survey and compare submitted systems and report on system performance on each task for both annotated and plain-tokenized versions of the data.

Proceedings of the 2nd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2021)
Amir Zeldes | Yang Janet Liu | Mikel Iruskieta | Philippe Muller | Chloé Braud | Sonia Badene
Proceedings of the 2nd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2021)

2020

Introduction to the Special Issue on Dialogue and Dialogue Systems
Liesbeth Degand | Philippe Muller
Traitement Automatique des Langues, Volume 61, Numéro 3 : Dialogue et systèmes de dialogue [Dialogue and dialogue systems]

Traitement Automatique des Langues, Volume 61, Numéro 3 : Dialogue et systèmes de dialogue [Dialogue and dialogue systems]
Liesbeth Degand | Philippe Muller
Traitement Automatique des Langues, Volume 61, Numéro 3 : Dialogue et systèmes de dialogue [Dialogue and dialogue systems]

DiscSense: Automated Semantic Analysis of Discourse Markers
Damien Sileo | Tim Van de Cruys | Camille Pradel | Philippe Muller
Proceedings of the Twelfth Language Resources and Evaluation Conference

Using a model trained to predict discourse markers between sentence pairs, we predict plausible markers between sentence pairs with a known semantic relation (provided by existing classification datasets). These predictions allow us to study the link between discourse markers and the semantic relations annotated in classification datasets. Handcrafted mappings have been proposed between markers and discourse relations on a limited set of markers and a limited set of categories, but there exist hundreds of discourse markers expressing a wide variety of relations, and there is no consensus on the taxonomy of relations between competing discourse theories (which are largely built in a top-down fashion). By using an automatic prediction method over existing semantically annotated datasets, we provide a bottom-up characterization of discourse markers in English. The resulting dataset, named DiscSense, is publicly available.

2019

Which aspects of discourse relations are hard to learn? Primitive decomposition for discourse relation classification
Charlotte Roze | Chloé Braud | Philippe Muller
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Discourse relation classification has proven to be a hard task, with rather low performance on several corpora that notably differ on the relation set they use. We propose to decompose the task into smaller, mostly binary tasks corresponding to various primitive concepts encoded into the discourse relation definitions. More precisely, we translate the discourse relations into a set of values for attributes based on distinctions used in the mappings between discourse frameworks proposed by Sanders et al. (2018). This arguably allows for a more robust representation of discourse relations, and enables us to address usually ignored aspects of discourse relation prediction, namely multiple labels and underspecified annotations. We show experimentally which of the conceptual primitives are harder to learn from the Penn Discourse Treebank English corpus, and propose a correspondence to predict the original labels, with preliminary empirical comparisons with a direct model.
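
The decomposition idea can be illustrated with a toy example. The table below is an assumption made for this sketch: it uses two simplified attributes (polarity and basic operation) and four coarse relation names, and it is not the attribute inventory or mapping used in the paper. It shows how a label becomes a set of attribute values, and how a partially predicted set of values maps back to one or several candidate labels.

```python
from typing import Dict, List, Optional

# Illustrative, simplified mapping from relation labels to primitive attributes.
PRIMITIVES: Dict[str, Dict[str, str]] = {
    "Cause":       {"polarity": "positive", "operation": "causal"},
    "Concession":  {"polarity": "negative", "operation": "causal"},
    "Conjunction": {"polarity": "positive", "operation": "additive"},
    "Contrast":    {"polarity": "negative", "operation": "additive"},
}


def decompose(label: str) -> Dict[str, str]:
    """Turn one relation label into its attribute values (one small task each)."""
    return dict(PRIMITIVES[label])


def candidate_labels(predicted: Dict[str, Optional[str]]) -> List[str]:
    """Labels compatible with the predicted attributes; None means underspecified."""
    return sorted(
        label
        for label, attrs in PRIMITIVES.items()
        if all(v is None or attrs[name] == v for name, v in predicted.items())
    )
```

Leaving an attribute as `None` returns every compatible label, which is one simple way to represent an underspecified annotation in this toy setting.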

ToNy: Contextual embeddings for accurate multilingual discourse segmentation of full documents
Philippe Muller | Chloé Braud | Mathieu Morey
Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Segmentation is the first step in building practical discourse parsers, and is often neglected in discourse parsing studies. The goal is to identify the minimal spans of text to be linked by discourse relations, or to isolate explicit marking of discourse relations. Existing systems on English report F1 scores as high as 95%, but they generally assume gold sentence boundaries and are restricted to English newswire texts annotated within the RST framework. This article presents a generic approach and a system, ToNy, a discourse segmenter developed for the DisRPT shared task where multiple discourse representation schemes, languages and domains are represented. In our experiments, we found that a straightforward sequence prediction architecture with pretrained contextual embeddings is sufficient to reach performance levels comparable to existing systems, when separately trained on each corpus. We report performance between 81% and 96% in F1 score. We also observed that discourse segmentation models only display a moderate generalization capability, even within the same language and discourse representation scheme.
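As a reading aid for the F1 figures quoted above, here is a minimal sketch of boundary-level precision, recall and F1. It assumes that a segmentation is represented as a set of token indices at which a discourse unit starts; that representation, the function name `boundary_f1` and the example indices are illustrative assumptions, not the shared task's evaluation script.

```python
def boundary_f1(gold, predicted):
    """Return (precision, recall, F1) over sets of segment-start token indices."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 4 gold boundaries, 4 predicted ones, 3 in common.
gold_starts = {0, 5, 9, 14}
predicted_starts = {0, 5, 10, 14}
p, r, f1 = boundary_f1(gold_starts, predicted_starts)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

With this exact-match criterion, a boundary predicted one token off counts as both a false positive and a false negative.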

Composition of Sentence Embeddings: Lessons from Statistical Relational Learning
Damien Sileo | Tim Van De Cruys | Camille Pradel | Philippe Muller
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

Various NLP problems – such as the prediction of sentence similarity, entailment, and discourse relations – are all instances of the same general task: the modeling of semantic relations between a pair of textual elements. A popular model for such problems is to embed sentences into fixed size vectors, and use composition functions (e.g. concatenation or sum) of those vectors as features for the prediction. At the same time, composition of embeddings has been a main focus within the field of Statistical Relational Learning (SRL) whose goal is to predict relations between entities (typically from knowledge base triples). In this article, we show that previous work on relation prediction between texts implicitly uses compositions from baseline SRL models. We show that such compositions are not expressive enough for several tasks (e.g. natural language inference). We build on recent SRL models to address textual relational problems, showing that they are more expressive, and can alleviate issues from simpler compositions. The resulting models significantly improve the state of the art in both transferable sentence representation learning and relation prediction.
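To make the notion of a composition function concrete, the sketch below builds pair features from two fixed-size sentence vectors using concatenation, absolute difference and element-wise product. The particular combination, the function names and the toy vectors are illustrative assumptions rather than the models studied in the paper. The example also shows one simple limit of such features: sum, product and absolute difference are symmetric in their arguments, so on their own they cannot encode the direction of an asymmetric relation.

```python
def elementwise_sum(u, v):
    return [a + b for a, b in zip(u, v)]


def elementwise_product(u, v):
    return [a * b for a, b in zip(u, v)]


def abs_difference(u, v):
    return [abs(a - b) for a, b in zip(u, v)]


def compose(u, v):
    """Pair features [u; v; |u - v|; u * v] as one flat list."""
    return u + v + abs_difference(u, v) + elementwise_product(u, v)


# Toy 2-dimensional "sentence embeddings" (hypothetical values).
u = [1.0, 2.0]
v = [3.0, 0.5]

features = compose(u, v)
print(len(features))  # 4 blocks of dimension 2

# Symmetric compositions give identical features for (u, v) and (v, u).
print(elementwise_sum(u, v) == elementwise_sum(v, u))
print(abs_difference(u, v) == abs_difference(v, u))
# Concatenation is order-sensitive.
print(u + v == v + u)
```

In this sketch only the concatenated block distinguishes the pair (u, v) from (v, u); the other blocks are identical under swapping.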

Mining Discourse Markers for Unsupervised Sentence Representation Learning
Damien Sileo | Tim Van De Cruys | Camille Pradel | Philippe Muller
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Current state-of-the-art systems in NLP rely heavily on manually annotated datasets, which are expensive to construct. Very little work adequately exploits unannotated data, such as discourse markers between sentences, mainly because of data sparseness and ineffective extraction methods. In the present work, we propose a method to automatically discover sentence pairs with relevant discourse markers, and apply it to massive amounts of data. Our resulting dataset contains 174 discourse markers with at least 10k examples each, even for rare markers such as “coincidentally” or “amazingly”. We use the resulting data as supervision for learning transferable sentence embeddings. In addition, we show that even though sentence representation learning through prediction of discourse markers yields state-of-the-art results across different transfer tasks, it is not clear that our models made use of the semantic relation between sentences, thus leaving room for further improvements.

Aprentissage non-supervisé pour l’appariement et l’étiquetage de cas cliniques en français - DEFT2019 (Unsupervised learning for matching and labelling of French clinical cases - DEFT2019 )
Damien Sileo | Tim Van de Cruys | Philippe Muller | Camille Pradel
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Défi Fouille de Textes (atelier TALN-RECITAL)

We present the system used by the Synapse/IRIT team in the DEFT2019 competition, which covered two tasks on clinical cases written in French: matching clinical cases with discussions, and keyword extraction. One distinctive feature is the use of unsupervised learning for both tasks, on a corpus built specifically for the French medical domain.

Représentation sémantique distributionnelle et alignement de conversations par chat (Distributional semantic representation and alignment of online chat conversations )
Tom Bourgeade | Philippe Muller
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts

Textual similarity measures play an important role in NLP because of their many applications, notably in information retrieval and classification. Dialogue, by contrast, has received less attention in this respect. Here we address the construction of a similarity measure for a corpus of chat conversations, using unsupervised methods that exploit distributional semantics at several levels, in the form of embeddings. At the same time, to enrich the measure and make the results easier to interpret, we compute explicit alignments between speech turns in the conversations using the Wasserstein distance, which makes it possible to take their structural dimension into account. Finally, we evaluate our approach on an extrinsic task using the small annotated part of the corpus, and observe that it outperforms a more naive variant based on averaging.

Analyse faiblement supervisée de conversation en actes de dialogue (Weakly supervised dialog act analysis)
Catherine Thompson | Nicholas Asher | Philippe Muller | Jérémy Auguste
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts

We address the analysis of chat conversations in a task-oriented setting, where a technical advisor talks to a customer, with the goal of labelling utterances with dialogue acts to feed downstream conversation analyses. We propose a lightly supervised method that relies on simple heuristics, a few development annotations, and an ensemble method over these rules, which is used to automatically and noisily annotate a larger corpus that can then serve as training data for a supervised model. We compare this approach with a standard supervised approach and show that it achieves very similar results, at a lower cost and while being easier to adapt to new data.
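The general idea of labelling utterances noisily with an ensemble of simple heuristics can be sketched as follows. Everything specific here is a hypothetical illustration: the three rules, the label names and the use of a plain majority vote are assumptions, since the abstract does not say which heuristics or which ensemble method were used.

```python
from collections import Counter


# Each hypothetical rule returns a dialogue-act label, or None to abstain.
def rule_greeting(utterance):
    starts = ("hello", "hi ", "good morning")
    return "greeting" if utterance.lower().startswith(starts) else None


def rule_question(utterance):
    return "question" if utterance.rstrip().endswith("?") else None


def rule_thanks(utterance):
    return "thanks" if "thank" in utterance.lower() else None


RULES = [rule_greeting, rule_question, rule_thanks]


def weak_label(utterance, rules=RULES):
    """Majority vote over the rules that fire; None if every rule abstains.

    Ties are not handled specially in this sketch.
    """
    votes = [rule(utterance) for rule in rules]
    votes = [label for label in votes if label is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


chat = ["Is it plugged in?", "My router is broken", "Thank you very much"]
noisy_labels = [weak_label(u) for u in chat]
print(noisy_labels)
```

Utterances for which every rule abstains stay unlabelled; the labelled remainder is the kind of noisy data that could then be used to train a supervised classifier.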

2018

A Dependency Perspective on RST Discourse Parsing and Evaluation
Mathieu Morey | Philippe Muller | Nicholas Asher
Computational Linguistics, Volume 44, Issue 2 - June 2018

Computational text-level discourse analysis mostly happens within Rhetorical Structure Theory (RST), whose structures have classically been presented as constituency trees, and relies on data from the RST Discourse Treebank (RST-DT); as a result, the RST discourse parsing community has largely borrowed from the syntactic constituency parsing community. The standard evaluation procedure for RST discourse parsers is thus a simplified variant of PARSEVAL, and most RST discourse parsers use techniques that originated in syntactic constituency parsing. In this article, we isolate a number of conceptual and computational problems with the constituency hypothesis. We then examine the consequences, for the implementation and evaluation of RST discourse parsers, of adopting a dependency perspective on RST structures, a view advocated so far only by a few approaches to discourse parsing. While doing that, we show the importance of the notion of headedness of RST structures. We analyze RST discourse parsing as dependency parsing by adapting to RST a recent proposal in syntactic parsing that relies on head-ordered dependency trees, a representation isomorphic to headed constituency trees. We show how to convert the original trees from the RST corpus, RST-DT, and their binarized versions used by all existing RST parsers to head-ordered dependency trees. We also propose a way to convert existing simple dependency parser output to constituent trees. This allows us to evaluate and to compare approaches from both constituent-based and dependency-based perspectives in a unified framework, using constituency and dependency metrics. We thus propose an evaluation framework to compare extant approaches easily and uniformly, something the RST parsing community has lacked up to now. We can also compare parsers’ predictions to each other across frameworks. 
This allows us to characterize families of parsing strategies across the different frameworks, in particular with respect to the notion of headedness. Our experiments provide evidence for the conceptual similarities between dependency parsers and shift-reduce constituency parsers, and confirm that dependency parsing constitutes a viable approach to RST discourse parsing.

Concaténation de réseaux de neurones pour la classification de tweets, DEFT2018 (Concatenation of neural networks for tweets classification, DEFT2018 )
Damien Sileo | Tim Van de Cruys | Philippe Muller | Camille Pradel
Actes de la Conférence TALN. Volume 2 - Démonstrations, articles des Rencontres Jeunes Chercheurs, ateliers DeFT

We present the system used by the Melodi/Synapse Développement team in the DEFT2018 competition on topic and sentiment classification of French tweets. We propose a single system for both tasks that concatenates two embedding methods and three sequence representation models. The system ranks 1st out of 13 in sentiment analysis and 4th out of 13 in topic classification.

2017

How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT
Mathieu Morey | Philippe Muller | Nicholas Asher
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

This article evaluates purported progress over the past years in RST discourse parsing. Several studies report a relative error reduction of 24 to 51% on all metrics that authors attribute to the introduction of distributed representations of discourse units. We replicate the standard evaluation of 9 parsers, 5 of which use distributed representations, from 8 studies published between 2013 and 2017, using their predictions on the test set of the RST-DT. Our main finding is that most recently reported increases in RST discourse parser performance are an artefact of differences in implementations of the evaluation procedure. We evaluate all these parsers with the standard Parseval procedure to provide a more accurate picture of actual RST discourse parser performance in standard evaluation settings. Under this more stringent procedure, the gains attributable to distributed representations represent at most a 16% relative error reduction on fully-labelled structures.
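Relative error reduction, the quantity quoted in this abstract, expresses a gain as a fraction of the error that remained before the improvement. The sketch below assumes scores on a 0 to 100 scale; the function name and the example scores are hypothetical and are not figures from the paper.

```python
def relative_error_reduction(baseline_score, new_score, max_score=100.0):
    """Fraction of the baseline error removed by the new system."""
    baseline_error = max_score - baseline_score
    new_error = max_score - new_score
    if baseline_error <= 0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores: the error goes from 50 points down to 42 points.
reduction = relative_error_reduction(50.0, 58.0)
print(f"{reduction:.0%}")
```

The same absolute gain of 8 points would be a larger relative error reduction from a stronger baseline, because less error remained to begin with.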

Changement stylistique de phrases par apprentissage faiblement supervisé (Textual Style Transfer using Weakly Supervised Learning)
Damien Sileo | Camille Pradel | Philippe Muller | Tim Van de Cruys
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 2 - Articles courts

Several natural language processing tasks involve modifying sentences while preserving their meaning as much as possible, such as paraphrasing, compression and simplification, each with its own data and models. We introduce a general method that addresses all of these problems using data that is easier to obtain: a set of sentences paired with indicators of their style, such as sentences and the type of sentiment they express. The method relies on an unsupervised representation learning model (a variational autoencoder), and then on modifying the learned representations to match a given style. The output is evaluated qualitatively, then quantitatively on the Microsoft sentence compression dataset, with encouraging results.

2016

A General Framework for the Annotation of Causality Based on FrameNet
Laure Vieu | Philippe Muller | Marie Candito | Marianne Djemaa
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present here a general set of semantic frames to annotate causal expressions, with a rich lexicon in French and an annotated corpus of about 5000 instances of causal lexical items with their corresponding semantic frames. The aim of our project is to have both the largest possible coverage of causal phenomena in French, across all parts of speech, and have it linked to a general semantic framework such as FN, to benefit in particular from the relations between other semantic frames, e.g., temporal ones or intentional ones, and the underlying upper lexical ontology that enables some forms of reasoning. This is part of the larger ASFALDA French FrameNet project, which focuses on a few different notional domains which are interesting in their own right (Djemaa et al., 2016), including cognitive positions and communication frames. In the process of building the French lexicon and preparing the annotation of the corpus, we had to remodel some of the frames proposed in FN based on English data, with hopefully more precise frame definitions to facilitate human annotation. This includes semantic clarifications of frames and frame elements, redundancy elimination, and added coverage. The result is arguably a significant improvement of the treatment of causality in FN itself.

Corpus Annotation within the French FrameNet: a Domain-by-domain Methodology
Marianne Djemaa | Marie Candito | Philippe Muller | Laure Vieu
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper reports on the development of a French FrameNet, within the ASFALDA project. While the first phase of the project focused on the development of a French set of frames and corresponding lexicon (Candito et al., 2014), this paper concentrates on the subsequent corpus annotation phase, which focused on four notional domains (commercial transactions, cognitive stances, causality and verbal communication). Given that full coverage is not reachable for a relatively “new” FrameNet project, we advocate that focusing on specific notional domains allowed us to obtain full lexical coverage for the frames of these domains, while partially reflecting word sense ambiguities. Furthermore, as frames and roles were annotated on two French Treebanks (the French Treebank (Abeillé and Barrier, 2004) and the Sequoia Treebank (Candito and Seddah, 2012)), we were able to extract a syntactico-semantic lexicon from the annotated frames. In the resource’s current status, there are 98 frames, 662 frame evoking words, 872 senses, and about 13000 annotated frames, with their semantic roles assigned to portions of text. The French FrameNet is freely available at alpage.inria.fr/asfalda.

A Supervised Approach for Enriching the Relational Structure of Frame Semantics in FrameNet
Shafqat Mumtaz Virk | Philippe Muller | Juliette Conrath
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Frame semantics is a theory of linguistic meanings, and is considered to be a useful framework for shallow semantic analysis of natural language. FrameNet, which is based on frame semantics, is a popular lexical semantic resource. In addition to providing a set of core semantic frames and their frame elements, FrameNet also provides relations between those frames (hence providing a network of frames i.e. FrameNet). We address here the limited coverage of the network of conceptual relations between frames in FrameNet, which has previously been pointed out by others. We present a supervised model using rich features from three different sources: structural features from the existing FrameNet network, information from the WordNet relations between synsets projected into semantic frames, and corpus-collected lexical associations. We show large improvements over baselines consisting of each of the three groups of features in isolation. We then use this model to select frame pairs as candidate relations, and perform evaluation on a sample with good precision.

2014

Presentation of the SemDis 2014 workshop: distributional semantics for two tasks - lexical substitution and exploration of specialized corpora (Présentation de l’atelier SemDis 2014 : sémantique distributionnelle pour la substitution lexicale et l’exploration de corpus spécialisés) [in French]
Cécile Fabre | Nabil Hathout | Lydia-Mai Ho-Dac | François Morlane-Hondère | Philippe Muller | Franck Sajous | Ludovic Tanguy | Tim Van de Cruys
TALN-RECITAL 2014 Workshop SemDis 2014 : Enjeux actuels de la sémantique distributionnelle (SemDis 2014: Current Challenges in Distributional Semantics)

TALN-RECITAL 2014 Workshop SemDis 2014 : Enjeux actuels de la sémantique distributionnelle (SemDis 2014: Current Challenges in Distributional Semantics)
Cécile Fabre | Nabil Hathout | Lydia-Mai Ho-Dac | François Morlane-Hondère | Philippe Muller | Franck Sajous | Ludovic Tanguy | Tim Van de Cruys
TALN-RECITAL 2014 Workshop SemDis 2014 : Enjeux actuels de la sémantique distributionnelle (SemDis 2014: Current Challenges in Distributional Semantics)

Predicting the relevance of distributional semantic similarity with contextual information
Philippe Muller | Cécile Fabre | Clémentine Adam
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Developing a French FrameNet: Methodology and First results
Marie Candito | Pascal Amsili | Lucie Barque | Farah Benamara | Gaël de Chalendar | Marianne Djemaa | Pauline Haas | Richard Huyghe | Yvette Yannick Mathieu | Philippe Muller | Benoît Sagot | Laure Vieu
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The Asfalda project aims to develop a French corpus with frame-based semantic annotations and automatic tools for shallow semantic analysis. We present the first part of the project: focusing on a set of notional domains, we delimited a subset of English frames, adapted them to French data when necessary, and developed the corresponding French lexicon. We believe that working domain by domain helped us to enforce the coherence of the resulting resource, and also has the advantage that, though the number of frames is limited (around a hundred), we obtain full coverage within a given domain.

Unsupervised extraction of semantic relations (Extraction non supervisée de relations sémantiques lexicales) [in French]
Juliette Conrath | Stergos Afantenos | Nicholas Asher | Philippe Muller
Proceedings of TALN 2014 (Volume 1: Long Papers)

Unsupervised extraction of semantic relations using discourse cues
Juliette Conrath | Stergos Afantenos | Nicholas Asher | Philippe Muller
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Expressivity and comparison of models of discourse structure
Antoine Venant | Nicholas Asher | Philippe Muller | Pascal Denis | Stergos Afantenos
Proceedings of the SIGDIAL 2013 Conference

MELODI: A Supervised Distributional Approach for Free Paraphrasing of Noun Compounds
Tim Van de Cruys | Stergos Afantenos | Philippe Muller
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

MELODI: Semantic Similarity of Words and Compositional Phrases using Latent Vector Weighting
Tim Van de Cruys | Stergos Afantenos | Philippe Muller
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

Evaluer et améliorer une ressource distributionnelle: protocole d’annotation de liens sémantiques en contexte [Evaluating and improving a distributional resource: protocol for in-context annotation of semantic links]
Clémentine Adam | Cécile Fabre | Philippe Muller
Traitement Automatique des Langues, Volume 54, Numéro 1 : Varia [Varia]

2012

An empirical resource for discovering cognitive principles of discourse organisation: the ANNODIS corpus
Stergos Afantenos | Nicholas Asher | Farah Benamara | Myriam Bras | Cécile Fabre | Mai Ho-dac | Anne Le Draoulec | Philippe Muller | Marie-Paule Péry-Woodley | Laurent Prévot | Josette Rebeyrolles | Ludovic Tanguy | Marianne Vergez-Couret | Laure Vieu
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes the ANNODIS resource, a discourse-level annotated corpus for French. The corpus combines two perspectives on discourse: a bottom-up approach and a top-down approach. The bottom-up view incrementally builds a structure from elementary discourse units, while the top-down view focuses on the selective annotation of multi-level discourse structures. The corpus is composed of texts that are diversified with respect to genre, length and type of discursive organisation. The methodology followed here involves an iterative design of annotation guidelines in order to reach satisfactory inter-annotator agreement levels. This allows us to raise a few issues relevant for the comparison of such complex objects as discourse structures. The corpus also serves as a source of empirical evidence for discourse theories. We present here two initial analyses taking advantage of this new annotated corpus: one that tested hypotheses on constraints governing discourse structure, and another that studied the variations in composition and signalling of multi-level discourse structures.

Constrained Decoding for Text-Level Discourse Parsing
Philippe Muller | Stergos Afantenos | Pascal Denis | Nicholas Asher
Proceedings of COLING 2012

Préface [Introduction to the special issue]
Interjeet Mani | Philippe Muller
Traitement Automatique des Langues, Volume 53, Numéro 2 : Traitement automatique des informations temporelles et spatiales en langage naturel [Automatic Processing for Temporal and Spatial Information in Natural Language]

2011

Comparaison d’une approche miroir et d’une approche distributionnelle pour l’extraction de mots sémantiquement reliés (Comparing a mirror approach and a distributional approach for extracting semantically related words)
Philippe Muller | Philippe Langlais
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

In (Muller & Langlais, 2010), we compared a distributional approach and a variant of the mirror approach proposed by Dyvik (2002) on a synonym extraction task using a French corpus. Here we present a finer-grained analysis of the automatically extracted relations, this time focusing on English, for which more extensive resources are available. Several ways of evaluating our approach corroborate the finding that the mirror approach performs better overall than the distributional approach described in (Lin, 1998), a reference approach in the field.

2010

Learning Recursive Segments for Discourse Parsing
Stergos Afantenos | Pascal Denis | Philippe Muller | Laurence Danlos
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Automatically detecting discourse segments is an important preliminary step towards full discourse parsing. Previous research on discourse segmentation has relied on the assumption that elementary discourse units (EDUs) in a document always form a linear sequence (i.e., they can never be nested). Unfortunately, this assumption turns out to be too strong, for some theories of discourse, like the "Segmented Discourse Representation Theory" or SDRT, allow for nested discourse units. In this paper, we present a simple approach to discourse segmentation that is able to produce nested EDUs. Our approach builds on standard multi-class classification techniques making use of a regularized maximum entropy model, combined with a simple repairing heuristic that enforces global coherence. Our system was developed and evaluated on the first round of annotations provided by the French Annodis project (an ongoing effort to create a discourse bank for French). Cross-validated on only 47 documents (1,445 EDUs), our system achieves encouraging performance results with an F-score of 73% for finding EDUs.

Comparison of different algebras for inducing the temporal structure of texts
Pascal Denis | Philippe Muller
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Une évaluation de l’impact des types de textes sur la tâche de segmentation thématique (An evaluation of the impact of text types on the topic segmentation task)
Clémentine Adam | Philippe Muller | Cécile Fabre
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This study aims to help define the objectives of topic segmentation (TS) by encouraging the text type to be taken into account as a parameter of the task. Our hypothesis is that, while TS is certainly relevant for texts whose organisation is indeed thematic, it is not suited to other modes of organisation (temporal, rhetorical) and cannot be applied without caution to arbitrary texts. By comparing the performance of a TS system on two corpora, one with “strong” and one with “weak” thematic organisation, we show that the task is indeed sensitive to the nature of the texts.

Comparaison de ressources lexicales pour l’extraction de synonymes (Comparing lexical resources for synonym extraction)
Philippe Muller | Philippe Langlais
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

2009

The ANNODIS project aims to build a corpus of texts annotated at the discourse level and to develop tools for corpus annotation and exploitation. The annotations adopt two complementary points of view: a bottom-up perspective starts from minimal discourse units and builds complex structures via a set of discourse relations; a top-down perspective considers the text as a whole and relies on pre-identified cues to detect high-level discourse structures. Corpus construction goes together with the creation of two interfaces: the first assists manual annotation of discourse relations and structures by displaying the markup produced by preprocessing; the second will be dedicated to exploiting the annotations. We present the annotation models and protocols designed to carry out the annotation campaign through the dedicated interface.

2008

Evaluation Metrics for Automatic Temporal Annotation of Texts
Xavier Tannier | Philippe Muller
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Recent years have seen increasing attention to the temporal processing of texts, as well as considerable effort to standardize temporal information in natural language. A central part of this information lies in the temporal relations between events described in a text, when their precise times or dates are not known. Reliable human annotation of such information is difficult, and automatic comparisons must follow procedures beyond mere precision-recall of local pieces of information, since a coherent picture can only be considered at a global level. We address the problem of evaluation metrics for such information, aiming at fair comparisons between systems, by proposing some measures that take into account the globality of a text.

Annotation d’expressions temporelles et d’événements en français (Annotation of temporal expressions and events in French)
Gabriel Parent | Michel Gagnon | Philippe Muller
Actes de la 15ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

In this article, we propose a method for identifying, in a French text, all adverbial expressions of temporal location, as well as all verbs, nouns and adjectives denoting an eventuality (event or state). In addition to identifying these expressions, the method extracts some semantic information: the value of the temporal location according to the TimeML standard, and the type of the eventualities. For adverbial expressions of temporal location, we use a cascade of automata, whereas for identifying events and states we rely on a full parse of the sentence. Our results are close to those of comparable work on English, in the absence of a similar quantitative evaluation on French.

2007

Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Posters
Nabil Hathout | Philippe Muller
Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs
Nabil Hathout | Philippe Muller
Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations
Nabil Hathout | Philippe Muller
Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations

2006

Synonym Extraction Using a Semantic Distance on a Dictionary
Philippe Muller | Nabil Hathout | Bruno Gaume
Proceedings of TextGraphs: the First Workshop on Graph Based Methods for Natural Language Processing

2004

Word Sense Disambiguation using a dictionary for sense similarity measure
Bruno Gaume | Nabil Hathout | Philippe Muller
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Annotating and measuring temporal relations in texts
Philippe Muller | Xavier Tannier
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Désambiguïsation par proximité structurelle (Disambiguation by structural proximity)
Bruno Gaume | Nabil Hathout | Philippe Muller
Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This article presents a disambiguation method in which word sense is determined using a dictionary. The method is based on an algorithm that computes a “semantic” distance between the words of the dictionary by taking into account the complete topology of the dictionary, viewed as a graph over its entries. We tested it on the disambiguation of the dictionary definitions themselves. The article presents preliminary results, which are very encouraging for a method that requires no annotated corpus.

Une méthode pour l’annotation de relations temporelles dans des textes et son évaluation (A method for annotating temporal relations in texts and its evaluation)
Philippe Muller | Xavier Tannier
Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This article deals with the automatic annotation of temporal information in texts, and more specifically with the relations between events introduced by the verbs in each clause. Although this problem has attracted many researchers on the theoretical side, it remains largely unexplored as far as systematic automatic annotation (and its evaluation) is concerned, even though early methodologies exist for having the task performed by humans. We propose both a method for performing the task automatically and a way of measuring how far the objective is met. We tested the feasibility of this on newswire dispatches, with encouraging first results.

Co-authors

Cécile Fabre 7

Yang Janet Liu 7

Camille Pradel 7

Eleni Metheniti 5

Lydia-Mai Ho-Dac 4

Laurent Prévot 4

Laura Rivière 4

Ludovic Tanguy 4

Clémentine Adam 3

Farah Benamara 3

Marie Candito 3

Juliette Conrath 3

Nicolas Devatine 3

Marianne Djemaa 3

Zhengyuan Liu 3

Mathieu Morey 3

Xavier Tannier 3

Tom Bourgeade 2

Liesbeth Degand 2

Mikel Iruskieta 2

Philippe Langlais 2

Anne Le Draoulec 2

François Morlane-Hondère 2

Galann Pennec 2

Marie-Paule Pery-Woodley 2

Attapol Rutherford 2

Franck Sajous 2

Marianne Vergez-Couret 2

Pascal Amsili 1

Tatsuya Aoyama 1

Nicholas Asher 1

Jérémy Auguste 1

Laurence Danlos 1

Patrice Enjalbert 1

Isabelle Ferrané 1

Stéphane Ferrari 1

Michel Gagnon 1

Lila Gravellier 1

Margarita Hernández Casas 1

Richard Huyghe 1

Khushnur Jahangir 1

Morteza Kamaladdini Ezzabady 1

Khalil Maachou 1

Interjeet Mani 1

Yannick Mathieu 1

Gabriel Parent 1

Thomas Pellegrini 1

Josette Rebeyrolle 1

Josette Rebeyrolles 1

Firmin Rousseau 1

Charlotte Roze 1

Benoît Sagot 1

Ayrton San Joaquin 1

Catherine Thompson 1

Antoine Venant 1

Shafqat Mumtaz Virk 1

Antoine Widlöcher 1

Gaël de Chalendar 1

Venues