Denis Béchet


2022

Iterated Dependencies in a Breton treebank and implications for a Categorial Dependency Grammar
Annie Foret | Denis Béchet | Valérie Bellynck
Proceedings of the 4th Celtic Language Technology Workshop within LREC2022

Categorial Dependency Grammars (CDG) are computational grammars for natural language processing that define dependency structures. They can be viewed as a formal system in which types are attached to words, combining the elimination rules of classical categorial grammars with valency pairing rules able to define discontinuous (non-projective) dependencies. Algorithms have been proposed to infer grammars in this class from treebanks, in accordance with Mel’čuk's principles. We consider this approach with experiments on Breton. We focus in particular on “repeatable” (iterated) dependencies and their patterns. A dependency d is iterated in a dependency structure if some word in this structure governs several other words through dependency d. We illustrate this approach with data in the Universal Dependencies format and dependency patterns written in Grew (a graph rewriting tool dedicated to applications in natural language processing).
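The definition of an iterated dependency given in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method (the paper expresses its patterns in Grew). It assumes the sentence is given as CoNLL-U text, with the head in column 7 and the dependency label in column 8. The function name `iterated_dependencies` and the toy English sentence are hypothetical and serve only as an illustration.

```python
from collections import Counter


def iterated_dependencies(conllu_sentence: str) -> set[str]:
    """Return every label d such that some word governs at least two
    other words through d (the definition quoted in the abstract)."""
    counts = Counter()
    for line in conllu_sentence.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Keep only plain word lines: 10 columns and an integer ID
        # (this skips multiword tokens such as "1-2" and empty nodes such as "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        head, deprel = cols[6], cols[7]
        if head == "0":
            continue  # the root is not governed by any word
        counts[(head, deprel)] += 1
    return {deprel for (_head, deprel), n in counts.items() if n >= 2}


# Made-up toy sentence (not taken from the Breton treebank).
SAMPLE = "\n".join(
    "\t".join(row)
    for row in [
        ["1", "Big", "big", "ADJ", "_", "_", "3", "amod", "_", "_"],
        ["2", "red", "red", "ADJ", "_", "_", "3", "amod", "_", "_"],
        ["3", "apples", "apple", "NOUN", "_", "_", "4", "nsubj", "_", "_"],
        ["4", "fell", "fall", "VERB", "_", "_", "0", "root", "_", "_"],
    ]
)

print(iterated_dependencies(SAMPLE))
```

In the toy sentence, word 3 governs both word 1 and word 2 through `amod`, so `amod` is reported as iterated, while `nsubj` occurs only once under its head and is not.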

2015

CDGFr, un corpus en dépendances non-projectives pour le français
Denis Béchet | Ophélie Lacroix
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

In dependency parsing of French, non-projectivity is rarely taken into account, largely because the data on which parsers are trained represent these particular cases poorly or not at all. In this article, we present a new, freely available dependency corpus for French that contains a substantial number of non-projective dependencies. This corpus will make it possible to study non-projectivity and to handle it better in the parsing of French.

2014

Validation Issues induced by an Automatic Pre-Annotation Mechanism in the Building of Non-projective Dependency Treebanks
Ophélie Lacroix | Denis Béchet
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

To build large dependency treebanks using the CDG Lab, a grammar-based dependency treebank development tool, an annotator usually has to fill in a selection form before parsing. This step is usually necessary because, otherwise, the search space is too big for long sentences and the parser fails to produce even one solution. With the information given by the annotator on the selection form, the parser can produce one or several dependency structures, and the annotator can proceed by adding positive or negative annotations on dependencies and iteratively relaunching the parser until the right dependency structure has been found. However, the selection form is sometimes difficult and time-consuming to fill in because the annotator must have an idea of the result before parsing. The CDG Lab proposes to replace this form with an automatic pre-annotation mechanism. However, this model introduces some issues during the annotation phase that do not exist when the annotator uses a selection form. The article presents those issues and proposes some modifications of the CDG Lab so that the automatic pre-annotation mechanism can be used effectively.

A Three-Step Transition-Based System for Non-Projective Dependency Parsing
Ophélie Lacroix | Denis Béchet
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2012

Calcul des cadres de sous catégorisation des noms déverbaux français (le cas du génitif) (On Computing Subcategorization Frames of French Deverbal Nouns (Case of Genitive)) [in French]
Ramadan Alfared | Denis Béchet | Alexander Dikovsky
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

2003

Remarques et perspectives sur les langages de prégroupe d’ordre 1/2
Denis Béchet | Annie Foret
Actes de la 10ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

This article deals with the automatic acquisition of Lambek grammars, which are used for the syntactic modelling of languages. Recently, algorithms have been proposed in Gold's learning model for certain classes of categorial grammars. By contrast, rigid or k-valued Lambek grammars are not learnable from strings. Here we focus on pregroup grammars. We show that the class of pregroup grammars is not learnable from strings, even when the order of types is strongly restricted (order 1/2); our proof amounts to constructing a limit point for this class.

k-Valued Non-Associative Lambek Categorial Grammars are not Learnable from Strings
Denis Béchet | Annie Foret
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

Incremental Parsing Of Lambek Calculus Using Proof-Net Interfaces
Denis Béchet
Proceedings of the Eighth International Conference on Parsing Technologies

The paper describes an incremental parsing algorithm for natural languages that uses normalized interfaces of modules of proof-nets. This algorithm produces at each step the different possible partial syntactical analyses of the first words of a sentence. Thus, it can analyze texts on the fly leaving partially analyzed sentences.