Patrick Watrin


2022

Is Attention Explanation? An Introduction to the Debate
Adrien Bibal | Rémi Cardon | David Alfter | Rodrigo Wilkens | Xiaoou Wang | Thomas François | Patrick Watrin
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The performance of deep learning models in NLP and other fields of machine learning has led to a rise in their popularity, and so the need for explanations of these models becomes paramount. Attention has been seen as a solution to increase performance, while providing some explanations. However, a debate has started to cast doubt on the explanatory power of attention in neural networks. Although the debate has created a vast literature thanks to contributions from various areas, the lack of communication is becoming more and more tangible. In this paper, we provide a clear overview of the insights on the debate by critically confronting works from these different areas. This holistic vision can be of great interest for future works in all the communities concerned by this debate. We sum up the main challenges spotted in these areas, and we conclude by discussing the most promising future avenues on attention as an explanation.

2021

FrenLyS: A Tool for the Automatic Simplification of French General Language Texts
Eva Rolin | Quentin Langlois | Patrick Watrin | Thomas François
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Lexical simplification (LS) aims at replacing words considered complex in a sentence with simpler equivalents. In this paper, we present the first automatic LS service for French, FrenLyS, which offers different techniques to generate, select and rank substitutes. The paper describes the different methods proposed by our tool, which include both classical approaches (e.g. generation of candidates from lexical resources, frequency filter) and more innovative approaches such as the exploitation of CamemBERT, a model for French based on the RoBERTa architecture. To evaluate the different methods, a new evaluation dataset for French is introduced.

2016

CENTAL at SemEval-2016 Task 12: a linguistically fed CRF model for medical and temporal information extraction
Charlotte Hansart | Damien De Meyere | Patrick Watrin | André Bittar | Cédrick Fairon
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2014

FLELex: a graded lexical resource for French foreign learners
Thomas François | Nùria Gala | Patrick Watrin | Cédrick Fairon
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we present FLELex, the first graded lexicon for French as a foreign language (FFL) that reports word frequencies by difficulty level (according to the CEFR scale). It has been obtained from a tagged corpus of 777,000 words from available textbooks and simplified readers intended for FFL learners. Our goal is to freely provide this resource to the community for a variety of purposes, ranging from the assessment of the lexical difficulty of a text, to the selection of simpler words within text simplification systems, to use as a dictionary in assistive tools for writing.

2012

Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing
Matthieu Constant | Anthony Sigogne | Patrick Watrin
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Extraction of unmarked quotations in Newspapers
Stéphanie Weiser | Patrick Watrin
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents work in progress to automatically extract quotation sentences from newspaper articles. The focus is the extraction and annotation of unmarked quotation sentences. A linguistic study shows that unmarked quotation sentences can be formalised into 16 patterns that can be used to develop an extraction grammar. The question of identifying the boundaries of unmarked quotations is also raised, as they are often ambiguous. An annotation scheme that describes all the elements that can occur in a quotation sentence is defined. This paper presents the creation of two resources necessary to our system. A dictionary of verbs introducing quotations has been automatically built using a grammar of marked quotation sentences to identify the verbs able to introduce quotations. A grammar formalising the patterns of unmarked quotation sentences (using the tool Unitex, based on finite state machines) has been developed. A short experiment has been performed on two patterns and shows some promising results.

La reconnaissance des mots composés à l’épreuve de l’analyse syntaxique et vice-versa : évaluation de deux stratégies discriminantes (Recognition of Compound Words Tested against Parsing and Vice-versa : Evaluation of Two Discriminative Approaches) [in French]
Matthieu Constant | Anthony Sigogne | Patrick Watrin
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

2011

On the Contribution of MWE-based Features to a Readability Formula for French as a Foreign Language
Thomas François | Patrick Watrin
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

An N-gram Frequency Database Reference to Handle MWE Extraction in NLP Applications
Patrick Watrin | Thomas François
Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World

Temporal Expressions Extraction in SMS messages
Stéphanie Weiser | Louis-Amélie Cougnon | Patrick Watrin
Proceedings of the RANLP 2011 Workshop on Information Extraction and Knowledge Acquisition

Quel apport des unités polylexicales dans une formule de lisibilité pour le français langue étrangère (What is the contribution of multi-word expressions in a readability formula for the French as a foreign language)
Thomas François | Patrick Watrin
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

This study considers the use of multiword expressions (MWEs) as predictors in a readability formula for French as a foreign language. Using an MWE extractor that combines a statistical approach with a linguistic filter, we define six variables that take into account the density and probability of nominal MWEs, as well as their internal structure. Our experiments show that these six variables have weak predictive power and reveal that a simple approach based on the average n-gram probability of the texts is more effective.

2010

Partial Parsing of Spontaneous Spoken French
Olivier Blanc | Matthieu Constant | Anne Dister | Patrick Watrin
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes the process and the resources used to automatically annotate a French corpus of spontaneous speech transcriptions in super-chunks. Super-chunks are enhanced chunks that can contain lexical multiword units. This partial parsing is based on a preprocessing stage of the spoken data that consists in reformatting and tagging utterances that break the syntactic structure of the text, such as disfluencies. Spoken specificities were formalized thanks to a systematic linguistic study of a 40-hour-long speech transcription corpus. The chunker uses large-coverage and fine-grained language resources for general written language that have been augmented with resources specific to spoken French. It consists in iteratively applying finite-state lexical and syntactic resources and outputting a finite automaton representing all possible chunk analyses. The best path is then selected thanks to a hybrid disambiguation stage. We show that our system reaches scores that are comparable with state-of-the-art results in the field.

2007

Segmentation en super-chunks
Olivier Blanc | Matthieu Constant | Patrick Watrin
Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

Since the parser developed by Harris in the late 1950s, multiword units have gradually been integrated into syntactic parsers. For the most part, however, they are still restricted to compound words, which are more stable and less numerous. Yet language is full of semi-fixed expressions that also form semantic units: adverbial expressions and collocations. As with traditional compound words, identifying these structures limits the combinatorial complexity induced by lexical ambiguity. In this article, we detail an experiment that integrates these notions into a super-chunk segmentation process that precedes syntactic parsing. We show that our chunker, developed for French, achieves a precision and recall of 92.9% and 98.7%, respectively. Moreover, multiword units account for 36.6% of the attachments internal to nominal and prepositional constituents.

2006

Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Conférences invitées
Piet Mertens | Cédrick Fairon | Anne Dister | Patrick Watrin
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Conférences invitées

Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs
Piet Mertens | Cédrick Fairon | Anne Dister | Patrick Watrin
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Posters
Piet Mertens | Cédrick Fairon | Anne Dister | Patrick Watrin
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Tutoriels
Piet Mertens | Cédrick Fairon | Anne Dister | Patrick Watrin
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Tutoriels

Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues
Piet Mertens | Cédrick Fairon | Anne Dister | Patrick Watrin
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues

Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues (Posters)
Piet Mertens | Cédrick Fairon | Anne Dister | Patrick Watrin
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues (Posters)