Laurence Danlos - ACL Anthology

Laurence Danlos

2018

Primary and secondary discourse connectives: definitions and lexicons
Laurence Danlos | Katerina Rysova | Magdalena Rysova | Manfred Stede
Dialogue Discourse Volume 9

Starting from the perspective that discourse structure arises from the presence of coherence relations, we provide a map of linguistic discourse structuring devices (DRDs), and focus on those for written text. We propose to structure these items by differentiating between primary and secondary connectives on the one hand, and free connecting phrases on the other. For the former, we propose that their behavior can be described by lexicons, and we show one concrete proposal that by now has been applied to three languages, with others being added in ongoing work. The lexical representations can be useful both for humans (theoretical investigations, transfer to other languages) and for machines (automatic discourse parsing and generation).

Discourse and Lexicons: Lexemes, MWEs, Grammatical Constructions and Compositional Word Combinations to Signal Discourse Relations
Laurence Danlos
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

Lexicons generally record a list of lexemes or non-compositional multiword expressions. We propose to build lexicons for compositional word combinations, namely “secondary discourse connectives”. Secondary discourse connectives play the same function as “primary discourse connectives” but the latter are either lexemes or non-compositional multiword expressions. The paper defines primary and secondary connectives, and explains why it is possible to build a lexicon for the compositional ones and how it could be organized. It also puts forward the utility of such a lexicon in discourse annotation and parsing. Finally, it opens the discussion on the constructions that signal a discourse relation between two spans of text.

2016

Un Verbenet du français [A Verbnet for French]
Laurence Danlos | Quentin Pradet | Lucie Barque | Takuya Nakamura | Matthieu Constant
Traitement Automatique des Langues, Volume 57, Numéro 1 : Varia [Varia]

Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles longs)
Laurence Danlos | Thierry Hamon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Articles longs)

Interfacing Sentential and Discourse TAG-based Grammars
Laurence Danlos | Aleksandre Maskharashvili | Sylvain Pogodalla
Proceedings of the 12th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+12)

Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 3 : RECITAL
Laurence Danlos | Thierry Hamon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 3 : RECITAL

Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 4 : Conférences invitées
Laurence Danlos | Thierry Hamon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 4 : Conférences invitées

Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP
Laurence Danlos | Thierry Hamon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP

Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Posters)
Laurence Danlos | Thierry Hamon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Posters)

Modelling Discourse in STAG: Subordinate Conjunctions and Attributing Phrases
Timothée Bernard | Laurence Danlos
Proceedings of the 12th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+12)

Improvement of VerbNet-like resources by frame typing
Laurence Danlos | Matthieu Constant | Lucie Barque
Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)

Verbenet is a French lexicon developed by “translation” of its English counterpart, VerbNet (Kipper-Schuler, 2005), and treatment of the specificities of French syntax (Pradet et al., 2014; Danlos et al., 2016). One difficulty encountered in its development springs from the fact that the list of (potentially numerous) frames has no internal organization. This paper proposes a type system for frames that shows whether two frames are variants of a given alternation. Frame typing facilitates coherence checking of the resource in a “virtuous circle”. We present the principles underlying a program we developed and used to automatically type frames in Verbenet. We also show that our system is portable to other languages.

Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 5 : Démonstrations
Laurence Danlos | Thierry Hamon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 5 : Démonstrations

2015

Grammaires phrastiques et discursives fondées sur les TAG : une approche de D-STAG avec les ACG [Sentential and discourse TAG-based grammars: an approach to D-STAG with ACGs]
Laurence Danlos | Aleksandre Maskharashvili | Sylvain Pogodalla
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

We present a method for interfacing a sentence grammar with a discourse grammar that avoids any intermediate processing step. The method is general enough to build discourse structures that are directed acyclic graphs (DAGs) rather than trees. Our analysis builds on an approach to discourse parsing, Discourse Synchronous TAG (D-STAG), which uses Tree Adjoining Grammars (TAG). To this end, we use an encoding of TAG into Abstract Categorial Grammars (ACG). This encoding makes it possible, on the one hand, to use higher-order types for semantic interpretation in order to build structures that are DAGs and not trees, and on the other hand, to use the composition properties of ACG to implement the interface between the sentence grammar and the discourse grammar in a natural way. All the examples given to illustrate the method have been implemented and can be tested with the appropriate software.

FDTB1: Repérage des connecteurs de discours en corpus [FDTB1: Identifying discourse connectives in a corpus]
Jacques Steinlin | Margot Colinet | Laurence Danlos
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

This paper presents the manual identification of discourse connectives in the FTB (French Treebank) corpus, which is already annotated for morphosyntax. This is the first step towards the full discourse annotation of the corpus. The task consists in projecting onto the corpus the items listed in LexConn, a lexicon of French connectives, and filtering out the occurrences of these items that have no discourse use but instead function, for example, as a manner adverb or as a preposition introducing a subcategorized complement. More than 10,000 connectives were identified in this way.

2014

Sub-categorization in ‘pour’ and lexical syntax (Sous-catégorisation en pour et syntaxe lexicale) [in French]
Benoît Sagot | Laurence Danlos | Margot Colinet
Proceedings of TALN 2014 (Volume 2: Short Papers)

An ACG Analysis of the G-TAG Generation Process
Laurence Danlos | Aleksandre Maskharashvili | Sylvain Pogodalla
Proceedings of the 8th International Natural Language Generation Conference (INLG)

Text Generation: Reexamining G-TAG with Abstract Categorial Grammars (Génération de textes : G-TAG revisité avec les Grammaires Catégorielles Abstraites) [in French]
Laurence Danlos | Aleksandre Maskharashvili | Sylvain Pogodalla
Proceedings of TALN 2014 (Volume 1: Long Papers)

Adapting VerbNet to French using existing resources
Quentin Pradet | Laurence Danlos | Gaël de Chalendar
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

VerbNet is an English lexical resource for verbs that has proven useful for English NLP due to its high coverage and coherent classification. Such a resource doesn’t exist for other languages, despite some (mostly automatic and unsupervised) attempts. We show how to semi-automatically adapt VerbNet using existing resources designed for different purposes. This study focuses on French and uses two French resources: a semantic lexicon (Les Verbes Français) and a syntactic lexicon (Lexique-Grammaire).

Toward a French VerbeNet (Vers la création d’un Verbnet français) [in French]
Laurence Danlos | Takuya Nakamura | Quentin Pradet
TALN-RECITAL 2014 Workshop FondamenTAL 2014 : Ressources lexicales et TAL - vue d’ensemble sur les dictionnaires électroniques de Jean Dubois et Françoise Dubois-Charlier (FondamenTAL 2014 : Lexical Resources and NLP)

Because We Say So
Julie Hunter | Laurence Danlos
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

2013

WoNeF, an improved, extended and evaluated automatic French translation of WordNet (WoNeF : amélioration, extension et évaluation d’une traduction française automatique de WordNet) [in French]
Quentin Pradet | Jeanne Baguenier-Desormeaux | Gaël de Chalendar | Laurence Danlos
Proceedings of TALN 2013 (Volume 1: Long Papers)

2012

Vers le FDTB : French Discourse Tree Bank (Towards the FDTB : French Discourse Tree Bank) [in French]
Laurence Danlos | Diégo Antolinos-Basso | Chloé Braud | Charlotte Roze
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

Semantic annotation of French corpora: animacy and verb semantic classes
Juliette Thuilier | Laurence Danlos
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a first corpus of French annotated for animacy and for verb semantic classes. The resource consists of 1,346 sentences extracted from three different corpora: the French Treebank (Abeillé and Barrier, 2004), the Est-Républicain corpus (CNRTL) and the ESTER corpus (ELRA). It is a set of parsed sentences, containing a verbal head subcategorizing two complements, with annotations on the verb and on both complements, in the TIGER XML format (Mengel and Lezius, 2000). The resource was manually annotated and manually corrected by three annotators. Animacy has been annotated following the categories of Zaenen et al. (2004). Measures of inter-annotator agreement are good (Multi-pi = 0.82 and Multi-kappa = 0.86 (k = 3, N = 2360)). As for verb semantic classes, we used three of the five levels of classification of an existing dictionary: 'Les Verbes du Français' (Dubois and Dubois-Charlier, 1997). For the higher level (generic classes), the measures of agreement are Multi-pi = 0.84 and Multi-kappa = 0.87 (k = 3, N = 1346). The inter-annotator agreements show that the annotated data are reliable for both animacy and verbal semantic classes.

2011

Analyse discursive et informations de factivité (Discursive analysis and information factivity)
Laurence Danlos
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

Discourse annotations proposed within discourse theories such as RST (Rhetorical Structure Theory) or SDRT (Segmented Discourse Representation Theory) have the strength of building a global discourse structure that links all the information given in a text. Discourse annotations proposed in the PDTB (Penn Discourse Tree Bank) have the strength of identifying the “source” of each piece of information in the text, thereby answering the question of who said or thinks what. We propose a unified approach to discourse annotation that combines the strengths of these two lines of research. This unified approach relies crucially on factivity information, such as that annotated in the (English) FactBank corpus.

EASYTEXT : un système opérationnel de génération de textes (EASYTEXT: an operational system for text generation)
Frédéric Meunier | Laurence Danlos | Vanessa Combet
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations

EasyText: an Operational NLG System
Laurence Danlos | Frédéric Meunier | Vanessa Combet
Proceedings of the 13th European Workshop on Natural Language Generation

French TimeBank: An ISO-TimeML Annotated Reference Corpus
André Bittar | Pascal Amsili | Pascal Denis | Laurence Danlos
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Traduction (automatique) des connecteurs de discours ((Machine) Translation of discourse connectors)
Laurence Danlos | Charlotte Roze
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

Drawing on data provided by the bilingual concordancer TransSearch, which incorporates statistical word-level alignment, we carried out a semi-manual annotation of the English translations of two French connectives. The results of this annotation show that the translations of these connectives do not match the “transpots” identified by TransSearch, and even less what bilingual dictionaries propose.

2010

Ponctuations fortes abusives [Misused strong punctuation marks]
Laurence Danlos | Benoît Sagot
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

Some strong punctuation marks are “improperly” used in place of weak punctuation marks, resulting in graphical sentences that are not grammatical sentences. This paper presents a corpus study of this phenomenon and a preliminary tool for automatically detecting misused strong punctuation marks.

Control Verb, Argument Cluster Coordination and Multi Component TAG
Djamé Seddah | Benoît Sagot | Laurence Danlos
Proceedings of the 10th International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+10)

Learning Recursive Segments for Discourse Parsing
Stergos Afantenos | Pascal Denis | Philippe Muller | Laurence Danlos
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Automatically detecting discourse segments is an important preliminary step towards full discourse parsing. Previous research on discourse segmentation has relied on the assumption that elementary discourse units (EDUs) in a document always form a linear sequence (i.e., they can never be nested). Unfortunately, this assumption turns out to be too strong, for some theories of discourse, like the “Segmented Discourse Representation Theory” or SDRT, allow for nested discourse units. In this paper, we present a simple approach to discourse segmentation that is able to produce nested EDUs. Our approach builds on standard multi-class classification techniques making use of a regularized maximum entropy model, combined with a simple repairing heuristic that enforces global coherence. Our system was developed and evaluated on the first round of annotations provided by the French Annodis project (an ongoing effort to create a discourse bank for French). Cross-validated on only 47 documents (1,445 EDUs), our system achieves encouraging performance results with an F-score of 73% for finding EDUs.

A Lexicon of French Quotation Verbs for Automatic Quotation Extraction
Benoît Sagot | Laurence Danlos | Rosa Stern
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Quotation extraction is an important information extraction task, especially when dealing with news wires. Quotations can be found in various configurations. In this paper, we focus on direct quotations introduced by a parenthetical clause, headed by a “quotation verb”. Our study is based on a large French news wire corpus from the Agence France-Presse. We introduce and motivate an analysis at the discursive level of such quotations, which differs from the syntactic analyses generally proposed. We show how we enriched the Lefff syntactic lexicon so that it provides an account for quotation verbs heading a quotation parenthetical, especially those extracted from a news wire corpus. We also sketch how these lexical entries can be extended to the discursive level in order to model quotations introduced in a parenthetical clause in a complete way.

2009

D-STAG : un formalisme d’analyse automatique de discours fondé sur les TAG synchrones [D-STAG : a discourse analysis formalism based on synchronous TAGs]
Laurence Danlos
Traitement Automatique des Langues, Volume 50, Numéro 1 : Varia [Varia]

Intégration des constructions à verbe support dans TimeML [Integrating light verb constructions into TimeML]
André Bittar | Laurence Danlos
Actes de la 16ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

The TimeML language was designed for annotating temporal information in texts, in particular events, temporal expressions and the relations between them. General annotation guidelines have been drawn up to guide annotators in this task, but some linguistic phenomena have yet to be treated in detail. A problem common to NLP tasks, whether in translation, generation or understanding, is the encoding of light verb constructions. So far, relatively little attention has been paid to this problem within the TimeML framework. In this paper, we propose annotation guidelines for light verb constructions.

2007

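Comparaison du Lexique-Grammaire des verbes pleins et de DICOVALENCE : vers une intégration dans le Lefff [Comparing the Lexique-Grammaire of full verbs with DICOVALENCE: towards integration into the Lefff]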
Laurence Danlos | Benoît Sagot
Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This paper compares the Lexique-Grammaire of full verbs and DICOVALENCE, two syntactic lexical resources for French that linguists have been developing for many years. In particular, we study the divergences and overlaps between the underlying lexical models. We then present the Lefff, a large-scale syntactic lexicon for NLP, and its own lexical model. We show that this model is able to integrate the lexical information found in the Lexique-Grammaire and in DICOVALENCE. We present the results of the first work carried out in this direction, with the long-term goal of building a reference syntactic lexicon for NLP.

D-STAG : un formalisme pour le discours basé sur les TAG synchrones [D-STAG: a formalism for discourse based on synchronous TAGs]
Laurence Danlos
Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

We propose D-STAG, a formalism for discourse that uses synchronous TAGs. The semantic analyses produced by D-STAG are hierarchical discourse structures annotated with coordinating or subordinating discourse relations. They are compatible with the discourse structures produced in both RST and SDRT. Coordinating and subordinating discourse relations are modeled by the substitution and adjunction operations introduced in TAG, respectively.

2006

Capacité générative forte de RST, SDRT et des DAG de dépendances pour le discours [Strong generative capacity of RST, SDRT and dependency DAGs for discourse]
Laurence Danlos
Traitement Automatique des Langues, Volume 47, Numéro 2 : Discours et document : traitements automatiques [Computational Approaches to Discourse and Document Processing]

2005

Automatic Recognition of French Expletive Pronoun Occurrences
Laurence Danlos
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

ILIMP: Outil pour repérer les occurrences du pronom impersonnel il [ILIMP: a tool for identifying occurrences of the impersonal pronoun il]
Laurence Danlos
Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

We present a tool, ILIMP, which takes as input a raw text (with no linguistic annotation) written in French and outputs the input text in which each occurrence of the pronoun il is tagged [ANAphorique] or [IMPersonnel]. The tool thus distinguishes anaphoric occurrences of the pronoun il, for which an anaphora resolution system must look for an antecedent, from occurrences where il is an impersonal (expletive) pronoun, for which looking for an antecedent makes no sense. ILIMP achieves a precision of 97.5%. We present a detailed error analysis and briefly describe other potential applications of the method used in ILIMP, as well as the use and place of ILIMP in a modular syntactic parsing system.

2004

Sentences with Two Subordinate Clauses: Syntactic and Semantic Analyses, Underspecified Semantic Representation
Laurence Danlos
Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms

Discourse Dependency Structures as Constrained DAGs
Laurence Danlos
Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004

2003

Représentation sémantique sous-spécifiée pour les conjonctions de subordination [Underspecified semantic representation for subordinating conjunctions]
Laurence Danlos
Actes de la 10ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This paper deals with complex sentences containing two subordinating conjunctions. We show that such sentences can be interpreted in four different ways. These are therefore highly ambiguous forms for which it is appropriate to resort to underspecified semantic representations, which is what we propose.

2002

A Complete Integrated NLG System Using AI and NLU Tools
Laurence Danlos | Adil El Ghali
COLING 2002: The 19th International Conference on Computational Linguistics

2001

Document Structuring à la SDRT
Laurence Danlos | Bertrand Gaiffe | Laurent Roussarie
Proceedings of the ACL 2001 Eighth European Workshop on Natural Language Generation (EWNLG)

2000

Generating a controlled language
Laurence Danlos | Guy Lapalme | Veronika Lux
INLG’2000 Proceedings of the First International Conference on Natural Language Generation

1998

System Demonstration FLAUBERT: An User Friendly System for Multilingual Text Generation
Frédéric Meunier | Laurence Danlos
Natural Language Generation

Linguistic ways for expressing a discourse relation in a lexicalized text generation system
Laurence Danlos
Discourse Relations and Discourse Markers

1992

Translation in the predicative element of a sentence: category switching, aspect and diathesis
Laurence Danlos | Pollet Samvelian
Proceedings of the Fourth Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

1988

Morphology and cross dependencies in the synthesis of personal pronouns in Romance languages
Laurence Danlos | Fiametta Namer
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics

SAGE - a Sentence Parsing and Generation System
Jean-Marie Lancel | Miyo Otani | Nathalie Simonin | Laurence Danlos
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics

1987

The Linguistic Basis of Text Generation
Laurence Danlos
Third Conference of the European Chapter of the Association for Computational Linguistics

1986

Synthesis of Spoken Messages from Semantic Representations (Semantic-Representation-to-Speech System)
Laurence Danlos | Eric Laporte | Francoise Emerard
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

1984

Conceptual and Linguistic Decisions in Generation
Laurence Danlos
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics
