Pauline Brunet


2020

pdf bib
Extrinsic Evaluation of French Dependency Parsers on a Specialized Corpus: Comparison of Distributional Thesauri
Ludovic Tanguy | Pauline Brunet | Olivier Ferret
Proceedings of the 12th Language Resources and Evaluation Conference

We present a study in which we compare 11 different French dependency parsers on a specialized corpus (consisting of research articles on NLP from the proceedings of the TALN conference). Due to the lack of a suitable gold standard, we use each of the parsers’ output to generate distributional thesauri using a frequency-based method. We compare these 11 thesauri to assess the impact of choosing a parser over another. We show that, without any reference data, we can still identify relevant subsets among the different parsers. We also show that the similarity we identify between parsers is confirmed on a restricted distributional benchmark.

pdf bib
Which Dependency Parser to Use for Distributional Semantics in a Specialized Domain?
Pauline Brunet | Olivier Ferret | Ludovic Tanguy
Proceedings of the 6th International Workshop on Computational Terminology

We present a study whose objective is to compare several dependency parsers for English applied to a specialized corpus for building distributional count-based models from syntactic dependencies. One of the particularities of this study is to focus on the concepts of the target domain, which mainly occur in documents as multi-terms and must be aligned with the outputs of the parsers. We compare a set of ten parsers in terms of syntactic triplets but also in terms of distributional neighbors extracted from the models built from these triplets, both with and without an external reference concerning the semantic relations between concepts. We show more particularly that some patterns of proximity between these parsers can be observed across our different evaluations, which could give insights for anticipating the performance of a parser for building distributional models from a given corpus

2019

pdf bib
Comparaison qualitative et extrinsèque d’analyseurs syntaxiques du français : confrontation de modèles distributionnels sur un corpus spécialisé (Extrinsic evaluation of French dependency parsers on a specialised corpus : comparison of distributional thesauri )
Ludovic Tanguy | Pauline Brunet | Olivier Ferret
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume I : Articles longs

Nous présentons une étude visant à comparer 11 différents analyseurs en dépendances du français sur un corpus spécialisé (constitué des archives des articles de la conférence TALN). En l’absence de gold standard, nous utilisons chacune des sorties de ces analyseurs pour construire des thésaurus distributionnels en utilisant une méthode à base de fréquence. Nous comparons ces 11 thésaurus afin de proposer un premier aperçu de l’impact du choix d’un analyseur par rapport à un autre.