Claire Nédellec

Also published as: Claire Nėdellec


2023

Exploitation de plongements de graphes pour l’extraction de relations biomédicales
Anfu Tang | Robert Bossy | Louise Deléger | Claire Nédellec | Pierre Zweigenbaum
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : travaux de recherche originaux -- articles longs

Integrating external knowledge into neural models is widely studied as a way to improve the performance of pre-trained language models, particularly in the biomedical domain. In this paper, we explore the contribution of knowledge base embeddings to a relation extraction task. For two candidate entity mentions in a text, we hypothesize that knowledge of relations between them, drawn from an external knowledge base (KB), helps predict the existence of a relation in the text, even when the KB relations differ from those in the text. Our approach consists of computing embeddings of the KB graph and estimating, for each entity pair in the text, the possibility that it is linked by a KB relation. Experiments on three biomedical relation extraction tasks show that our method outperforms the baseline PubMedBERT model and achieves performance comparable to state-of-the-art methods.

2020

Handling Entity Normalization with no Annotated Corpus: Weakly Supervised Methods Based on Distributional Representation and Ontological Information
Arnaud Ferré | Robert Bossy | Mouhamadou Ba | Louise Deléger | Thomas Lavergne | Pierre Zweigenbaum | Claire Nédellec
Proceedings of the Twelfth Language Resources and Evaluation Conference

Entity normalization (or entity linking) is an important subtask of information extraction that links entity mentions in text to categories or concepts in a reference vocabulary. Machine-learning-based normalization methods adapt well as long as they have enough training data of sufficient quality per reference. Distributional representations are commonly used because of their capacity to handle different expressions with similar meanings. However, in specific technical and scientific domains, the small amount of training data and the relatively small size of specialized corpora remain major challenges. Recently, the machine-learning-based CONTES method has addressed these challenges for reference vocabularies that are ontologies, as is often the case in life sciences and biomedical domains. Yet its performance depends on a manually annotated corpus. Furthermore, as with other machine-learning-based methods, parametrization remains tricky. We propose a new approach to address the scarcity of training data that extends the CONTES method with corpus selection, pre-processing and weak supervision strategies, which can yield high-performance results without any manually annotated examples. We also study which hyperparameters are most influential, and sometimes find patterns that differ from previous work. The results show that our approach significantly improves accuracy and outperforms previous state-of-the-art algorithms.

2019

Bacteria Biotope at BioNLP Open Shared Tasks 2019
Robert Bossy | Louise Deléger | Estelle Chaix | Mouhamadou Ba | Claire Nédellec
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks

This paper presents the fourth edition of the Bacteria Biotope task at BioNLP Open Shared Tasks 2019. The task focuses on the extraction of the locations and phenotypes of microorganisms from PubMed abstracts and full-text excerpts, and the characterization of these entities with respect to reference knowledge sources (NCBI taxonomy, OntoBiotope ontology). The task is motivated by the importance of knowledge about biodiversity for fundamental research and applications in microbiology. The paper describes the proposed subtasks, the corpus characteristics, and the challenge organization. We also provide an analysis of the results obtained by participants, and examine how the results have evolved since the last edition in 2016.

2018

Combining rule-based and embedding-based approaches to normalize textual entities with an ontology
Arnaud Ferré | Louise Deléger | Pierre Zweigenbaum | Claire Nédellec
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Representation of complex terms in a vector space structured by an ontology for a normalization task
Arnaud Ferré | Pierre Zweigenbaum | Claire Nédellec
BioNLP 2017

In this paper, we propose a semi-supervised method for labeling terms in texts with concepts of a domain ontology. The method generates continuous vector representations of complex terms in a semantic space structured by the ontology. It relies on a distributional semantics approach, which generates initial vectors for each of the extracted terms. These vectors are then embedded in the vector space constructed from the structure of the ontology. This embedding is carried out by training a linear model. Finally, we apply a distance calculation to determine the proximity between term vectors and concept vectors, and thus assign ontology labels to terms. We have evaluated the quality of these representations on a normalization task, using the concepts of an ontology as semantic labels. Normalization of terms is an important step in extracting part of the information contained in texts, but the generated vector space might find other applications. The performance of this method is comparable to that of the state of the art for this normalization task, opening up encouraging prospects.
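
The pipeline described in this abstract (initial term vectors, a trained linear map into the ontology space, then a distance-based label assignment) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy vectors, the concept names, the use of ordinary least squares to fit the linear model, and cosine similarity as the proximity measure.

```python
import numpy as np


def fit_linear_map(term_vectors, concept_vectors):
    """Fit W by least squares so that term_vectors @ W approximates concept_vectors."""
    W, *_ = np.linalg.lstsq(term_vectors, concept_vectors, rcond=None)
    return W


def nearest_concept(term_vector, W, concept_table):
    """Project a term vector into the concept space and return the closest concept name."""
    projected = term_vector @ W
    names = list(concept_table)
    matrix = np.stack([concept_table[name] for name in names])
    # Cosine similarity between the projected term and every concept vector.
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(projected)
    similarities = (matrix @ projected) / (norms + 1e-12)
    return names[int(np.argmax(similarities))]


# Hypothetical concept vectors standing in for an ontology-structured space.
concept_table = {
    "bacterium": np.array([1.0, 0.0, 0.0]),
    "habitat": np.array([0.0, 1.0, 0.0]),
    "phenotype": np.array([0.0, 0.0, 1.0]),
}

# Hypothetical distributional vectors of three training terms, one per concept.
train_terms = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])
train_concepts = np.stack(
    [concept_table[name] for name in ("bacterium", "habitat", "phenotype")]
)

W = fit_linear_map(train_terms, train_concepts)
print(nearest_concept(np.array([0.9, 0.1, 0.0, 0.3]), W, concept_table))
```

In this sketch the unseen term is labeled with whichever concept vector lies closest to its projection; a real system would use learned term embeddings and concept vectors derived from the ontology structure.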

2016

Proceedings of the 4th BioNLP Shared Task Workshop
Claire Nėdellec | Robert Bossy | Jin-Dong Kim
Proceedings of the 4th BioNLP Shared Task Workshop

Overview of the Regulatory Network of Plant Seed Development (SeeDev) Task at the BioNLP Shared Task 2016.
Estelle Chaix | Bertrand Dubreucq | Abdelhak Fatihi | Dialekti Valsamou | Robert Bossy | Mouhamadou Ba | Louise Deléger | Pierre Zweigenbaum | Philippe Bessières | Loic Lepiniec | Claire Nédellec
Proceedings of the 4th BioNLP Shared Task Workshop

Overview of the Bacteria Biotope Task at BioNLP Shared Task 2016
Louise Deléger | Robert Bossy | Estelle Chaix | Mouhamadou Ba | Arnaud Ferré | Philippe Bessières | Claire Nédellec
Proceedings of the 4th BioNLP Shared Task Workshop

2013

Proceedings of the BioNLP Shared Task 2013 Workshop
Claire Nédellec | Robert Bossy | Jin-Dong Kim | Jung-jae Kim | Tomoko Ohta | Sampo Pyysalo | Pierre Zweigenbaum
Proceedings of the BioNLP Shared Task 2013 Workshop

Overview of BioNLP Shared Task 2013
Claire Nédellec | Robert Bossy | Jin-Dong Kim | Jung-jae Kim | Tomoko Ohta | Sampo Pyysalo | Pierre Zweigenbaum
Proceedings of the BioNLP Shared Task 2013 Workshop

BioNLP Shared Task 2013 – An overview of the Genic Regulation Network Task
Robert Bossy | Philippe Bessières | Claire Nédellec
Proceedings of the BioNLP Shared Task 2013 Workshop

BioNLP shared Task 2013 – An Overview of the Bacteria Biotope Task
Robert Bossy | Wiktoria Golik | Zorana Ratkovic | Philippe Bessières | Claire Nédellec
Proceedings of the BioNLP Shared Task 2013 Workshop

2012

AlvisAE: a collaborative Web text annotation editor for knowledge acquisition
Frédéric Papazian | Robert Bossy | Claire Nédellec
Proceedings of the Sixth Linguistic Annotation Workshop

2011

BioNLP Shared Task 2011 - Bacteria Biotope
Robert Bossy | Julien Jourde | Philippe Bessières | Maarten van de Guchte | Claire Nédellec
Proceedings of BioNLP Shared Task 2011 Workshop

BioNLP 2011 Task Bacteria Biotope – The Alvis system
Zorana Ratkovic | Wiktoria Golik | Pierre Warnier | Philippe Veber | Claire Nédellec
Proceedings of BioNLP Shared Task 2011 Workshop

Sentence Filtering for BioNLP: Searching for Renaming Acts
Pierre Warnier | Claire Nédellec
Proceedings of BioNLP Shared Task 2011 Workshop

2010

Named and Specific Entity Detection in Varied Data: The Quæro Named Entity Baseline Evaluation
Olivier Galibert | Ludovic Quintard | Sophie Rosset | Pierre Zweigenbaum | Claire Nédellec | Sophie Aubin | Laurent Gillard | Jean-Pierre Raysz | Delphine Pois | Xavier Tannier | Louise Deléger | Dominique Laurent
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The Quæro program promotes research and industrial innovation on technologies for automatic analysis and classification of multimedia and multilingual documents. Within this context, a set of evaluations of Named Entity recognition systems was held in 2009. Four tasks were defined. The first two concerned traditional named entities, one in French broadcast news (a rerun of ESTER 2) and the other in OCR-ed old newspapers. The third was gene and protein name extraction in medical abstracts. The last one was the detection of references in patents. Four different partners participated, giving a total of 16 systems. We provide a synthetic description of all of them, classifying them by the main approaches chosen (resource-based, rule-based or statistical), without forgetting that any modern system is at some point hybrid. The metric (the relatively standard Slot Error Rate) and the results are also presented and discussed. Finally, a process is ongoing, with preliminary acceptance from the partners, to make all the corpora used available to the community, with the exception of the ESTER 2 corpus, which was not produced by Quæro.
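
For readers unfamiliar with the metric named in this abstract, the sketch below shows the basic, unweighted form of the Slot Error Rate: the number of substituted, deleted and inserted slots divided by the number of slots in the reference. The function name and the counts are hypothetical, and the abstract does not say whether the evaluation weighted error types differently, so this is only an illustration of the general idea.

```python
def slot_error_rate(substitutions, deletions, insertions, reference_slots):
    """Unweighted Slot Error Rate: total slot errors over reference slot count."""
    if reference_slots <= 0:
        raise ValueError("reference_slots must be positive")
    return (substitutions + deletions + insertions) / reference_slots


# Hypothetical counts: 5 wrong slots, 3 missed, 2 spurious, 50 reference slots.
print(slot_error_rate(5, 3, 2, 50))
```

Because insertions count as errors but do not add to the denominator, the rate can exceed 1 when a system produces many spurious slots.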

2004

Event-Based Information Extraction for the Biomedical Domain: the Caderige Project
Erick Alphonse | Sophie Aubin | Philippe Bessières | Gilles Bisson | Thierry Hamon | Sandrine Lagarrigue | Adeline Nazarenko | Alain-Pierre Manine | Claire Nédellec | Mohamed Ould Abdel Vetah | Thierry Poibeau | Davy Weissenbacher
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)