Bruno Oberle


2020

pdf bib
Coreference-Based Text Simplification
Rodrigo Wilkens | Bruno Oberle | Amalia Todirascu
Proceedings of the 1st Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI)

Text simplification aims at adapting documents to make them easier to read by a given audience. Usually, simplification systems consider only lexical and syntactic levels, and, moreover, are often evaluated at the sentence level. Thus, studies on the impact of simplification in text cohesion are lacking. Some works add coreference resolution in their pipeline to address this issue. In this paper, we move forward in this direction and present a rule-based system for automatic text simplification, aiming at adapting French texts for dyslexic children. The architecture of our system takes into account not only lexical and syntactic but also discourse information, based on coreference chains. Our system has been manually evaluated in terms of grammaticality and cohesion. We have also built and used an evaluation corpus containing multiple simplification references for each sentence. It has been annotated by experts following a set of simplification guidelines, and can be used to run automatic evaluation of other simplification systems. Both the system and the evaluation corpus are freely available.

pdf bib
French Coreference for Spoken and Written Language
Rodrigo Wilkens | Bruno Oberle | Frédéric Landragin | Amalia Todirascu
Proceedings of the Twelfth Language Resources and Evaluation Conference

Coreference resolution aims at identifying and grouping all mentions referring to the same entity. In French, most systems run different setups, making their comparison difficult. In this paper, we present an extensive comparison of several coreference resolution systems for French. The systems have been trained on two corpora (ANCOR for spoken language and Democrat for written language) annotated with coreference chains, and augmented with syntactic and semantic information. The models are compared with different configurations (e.g. with and without singletons). In addition, we evaluate mention detection and coreference resolution apart. We present a full-stack model that outperforms other approaches. This model allows us to study the impact of mention detection errors on coreference resolution. Our analysis shows that mention detection can be improved by focusing on boundary identification while advances in the pronoun-noun relation detection can help the coreference task. Another contribution of this work is the first end-to-end neural French coreference resolution model trained on Democrat (written texts), which compares to the state-of-the-art systems for oral French.

2019

pdf bib
Détection automatique de chaînes de coréférence pour le français écrit: règles et ressources adaptées au repérage de phénomènes linguistiques spécifiques (Automatic coreference resolution for written French : rules and resources for specific linguistic phenomena)
Bruno Oberle
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume III : RECITAL

Nous présentons un système end-to-end de détection automatique des chaînes de coréférence, à base de règles, pour le français écrit. Ce système insiste sur la prise en compte de phénomènes linguistiques négligés par d’autres systèmes. Nous avons élaboré des ressources lexicales pour la résolution des anaphores infidèles (Mon chat... Cet animal...), notamment lorsqu’elles incluent une entité nommée (La Seine... Ce fleuve...). Nous utilisons également des règles pour le repérage de mentions de groupes (Pierre et Paul) et d’anaphores zéros (Pierre boit et ø fume), ainsi que des règles pour la détection des pronoms de première et deuxième personnes dans les citations (Paul a dit : “Je suis étudiant.”). L’article présente l’élaboration des ressources et règles utilisées pour la gestion de ces phénomènes spécifiques, avant de décrire le système dans son ensemble, et notamment les différentes phases de la résolution de la coréférence.

2018

pdf bib
SACR: A Drag-and-Drop Based Tool for Coreference Annotation
Bruno Oberle
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)