LC-Score: Reference-less estimation of Text Comprehension Difficulty
Paul Tardy | Charlotte Roze | Paul Poupet
Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability

Being able to read and understand written text is critical in a digital era. However, studies show that a large fraction of the population experiences comprehension issues. In this context, further accessibility initiatives are required to improve audience text comprehension. However, writers are rarely assisted or encouraged to produce easy-to-understand content. Moreover, Automatic Text Simplification (ATS) model development suffers from the lack of a metric that accurately estimates comprehension difficulty. We present LC-SCORE, a simple approach for training a reference-less text comprehension metric for any text, i.e. predicting how easy a given text is to understand on a [0, 100] scale. Our objective with this scale is to quantitatively capture the extent to which a text conforms to the Langage Clair (LC, Clear Language) guidelines, a French initiative closely related to English Plain Language. We explore two approaches: (i) training statistical models on linguistically motivated indicators, and (ii) neural learning directly from text, leveraging pre-trained language models. We introduce a simple proxy task that casts comprehension difficulty training as a classification task. To evaluate our models, we run two distinct human annotation experiments, and find that both approaches (indicator-based and neural) outperform commonly used readability and comprehension metrics such as FKGL.
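For context on the FKGL baseline named in the abstract, here is a minimal sketch of the standard Flesch-Kincaid Grade Level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The function names and the vowel-group syllable heuristic are assumptions made for illustration; the heuristic is a rough approximation for English and is not how the paper computes its baselines.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption): count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level with naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    easy = "The cat sat."
    hard = "Comprehension difficulty estimation requires considerable linguistic sophistication."
    print(fkgl(easy), fkgl(hard))
```

Because the formula only looks at sentence length and syllable counts, it needs no reference text, but it also ignores vocabulary, syntax, and discourse, which is the kind of limitation a learned metric is meant to address.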

An automated tool with human supervision to adapt difficult texts into Plain Language
Paul Poupet | Morgane Hauguel | Erwan Boehm | Charlotte Roze | Paul Tardy
Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability

In this paper, we present an automated tool with human supervision to write in plain language or to adapt difficult texts into plain language. It can be used as a web application and as a plugin for Word/Outlook. At the publication date, it is only available in French. The tool has been in development for 3 years and has been used by 400 users from private companies and public administrations. Text simplification is performed automatically, subject to the manual approval of the user, at the lexical, syntactic, and discursive levels. A screencast of the demo can be found at the following link:


Aligning Discourse and Argumentation Structures using Subtrees and Redescription Mining
Laurine Huber | Yannick Toussaint | Charlotte Roze | Mathilde Dargnat | Chloé Braud
Proceedings of the 6th Workshop on Argument Mining

In this paper, we investigate similarities between discourse and argumentation structures by aligning subtrees in a corpus containing both annotations. Contrary to previous work, we focus on comparing sub-structures and not only relation matches. Using data mining techniques, we show that discourse and argumentation most often align well, and that the double annotation makes it possible to derive a mapping between structures. Moreover, this approach enables the study of similarities between discourse structures and differences in their expressive power.

Which aspects of discourse relations are hard to learn? Primitive decomposition for discourse relation classification
Charlotte Roze | Chloé Braud | Philippe Muller
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Discourse relation classification has proven to be a hard task, with rather low performance on several corpora that notably differ on the relation set they use. We propose to decompose the task into smaller, mostly binary tasks corresponding to various primitive concepts encoded into the discourse relation definitions. More precisely, we translate the discourse relations into a set of values for attributes based on distinctions used in the mappings between discourse frameworks proposed by Sanders et al. (2018). This arguably allows for a more robust representation of discourse relations, and enables us to address usually ignored aspects of discourse relation prediction, namely multiple labels and underspecified annotations. We show experimentally which of the conceptual primitives are harder to learn from the Penn Discourse Treebank English corpus, and propose a correspondence to predict the original labels, with preliminary empirical comparisons with a direct model.
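The decomposition idea can be sketched as follows. This is only an illustration: the primitive names loosely follow the kind of distinctions discussed by Sanders et al., but the table `RELATION_TO_PRIMITIVES`, its values, and the helper names are hypothetical and are not the mapping used in the paper. A value of `None` stands for an unspecified primitive.

```python
from typing import Dict, List, Optional

Primitives = Dict[str, Optional[str]]

# Hypothetical, simplified mapping from relation labels to primitive values.
RELATION_TO_PRIMITIVES: Dict[str, Primitives] = {
    "Cause.Reason": {"operation": "causal", "polarity": "positive", "order": "non-basic"},
    "Cause.Result": {"operation": "causal", "polarity": "positive", "order": "basic"},
    "Concession": {"operation": "causal", "polarity": "negative", "order": None},
    "Contrast": {"operation": "additive", "polarity": "negative", "order": None},
    "Conjunction": {"operation": "additive", "polarity": "positive", "order": None},
}


def decompose(label: str) -> Primitives:
    """Turn one relation label into targets for several small classification tasks."""
    return dict(RELATION_TO_PRIMITIVES[label])


def compatible_labels(predicted: Primitives) -> List[str]:
    """Return the labels whose specified primitives agree with the predictions.

    An unspecified value (None) on either side matches anything, so an
    underspecified prediction can map back to several candidate labels.
    """
    result = []
    for label, prims in RELATION_TO_PRIMITIVES.items():
        if all(
            value is None or predicted.get(name) is None or predicted[name] == value
            for name, value in prims.items()
        ):
            result.append(label)
    return result


if __name__ == "__main__":
    print(decompose("Concession"))
    print(compatible_labels({"operation": "causal", "polarity": "positive", "order": None}))
```

In such a setup, each primitive would be predicted by its own classifier, and the original label set would be recovered by intersecting the predictions, which is one way to accommodate multiple labels and underspecified annotations.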


CommunicoTool Advance, un prototype d’application d’aide à la communication (CommunicoTool Advance: an assistive communication app prototype)
Charlotte Roze
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 5 : Démonstrations

CommunicoTool Advance is a prototype of a mobile communication-aid application intended for people with motor and speech impairments.


Identification of Shell Nouns, Signals of Discourse Organisation (Identification des noms sous-spécifiés, signaux de l’organisation discursive) [in French]
Charlotte Roze | Thierry Charnois | Dominique Legallois | Stéphane Ferrari | Mathilde Salles
Proceedings of TALN 2014 (Volume 1: Long Papers)


Vers le FDTB : French Discourse Tree Bank (Towards the FDTB : French Discourse Tree Bank) [in French]
Laurence Danlos | Diégo Antolinos-Basso | Chloé Braud | Charlotte Roze
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN


Traduction (automatique) des connecteurs de discours ((Machine) Translation of discourse connectors)
Laurence Danlos | Charlotte Roze
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

Drawing on data provided by the bilingual concordancer TransSearch, which integrates statistical word-level alignment, we carried out a semi-manual annotation of the English translations of two French connectives. The results of this annotation show that the translations of these connectives do not correspond to the "transpots" identified by TransSearch, and even less to what bilingual dictionaries propose.

Vers une algèbre des relations de discours pour la comparaison de structures discursives (Towards an algebra of discourse relations for the comparison of discourse structures)
Charlotte Roze
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues

We propose a methodology for building deduction rules for discourse relations, intended to be integrated into an algebra of these relations. The main purpose of these rules is to make it possible to compute the discursive closure of a discourse structure, that is, to deduce all the relations that the structure implicitly contains. Computing the closure of discourse structures can improve their comparison, notably in the context of evaluating automatic discourse parsing systems. We present the adopted methodology and illustrate it through the study of one deduction rule.
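A closure of this kind can be sketched as a fixpoint computation. The sketch below assumes, purely for illustration, that deduction rules have the form R1(a, b) ∧ R2(b, c) → R3(a, c); the rule table, the transitivity rule for Narration, and all names are hypothetical and not taken from the paper.

```python
from typing import Dict, Set, Tuple

Relation = Tuple[str, str, str]  # (relation name, first segment, second segment)
Rules = Dict[Tuple[str, str], str]  # (R1, R2) -> R3, hypothetical rule format


def discourse_closure(relations: Set[Relation], rules: Rules) -> Set[Relation]:
    """Apply the rules repeatedly until no new relation can be deduced."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        for r1, a, b in list(closure):
            for r2, b2, c in list(closure):
                if b != b2 or a == c:
                    continue
                derived = rules.get((r1, r2))
                if derived is not None and (derived, a, c) not in closure:
                    closure.add((derived, a, c))
                    changed = True
    return closure


if __name__ == "__main__":
    # Hypothetical rule: Narration is treated as transitive.
    rules = {("Narration", "Narration"): "Narration"}
    gold = {("Narration", "1", "2"), ("Narration", "2", "3")}
    predicted = gold | {("Narration", "1", "3")}
    # The two structures differ explicitly but have the same closure.
    print(discourse_closure(gold, rules) == discourse_closure(predicted, rules))
```

Comparing closures rather than the explicitly annotated relations avoids penalising a structure for stating a relation that the other one only implies.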