Marine Schmitt


Démonstrateur en-ligne du projet ANR PARSEME-FR sur les expressions polylexicales (On-line demonstrator of the PARSEME-FR project on multiword expressions)
Marine Schmitt | Elise Moreau | Mathieu Constant | Agata Savary
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume IV : Démonstrations

We present the online demonstrator of the ANR PARSEME-FR project, which is dedicated to multiword expressions. It includes several tools for identifying such expressions and a tool for exploring the project's linguistic resources.

Neural Lemmatization of Multiword Expressions
Marine Schmitt | Mathieu Constant
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)

This article focuses on the lemmatization of multiword expressions (MWEs). We propose a deep encoder-decoder architecture that generates, for every word of an MWE, its corresponding part of the lemma, based on the internal context of the MWE. The encoder relies on recurrent networks over (1) the character sequence of each individual word, to capture its morphological properties, and (2) the word sequence of the MWE, to capture lexical and syntactic properties. The decoder, which generates the corresponding part of the lemma for each word of the MWE, is a classical character-level attention-based recurrent model. Our model is evaluated on Italian, French, Polish and Portuguese and performs well on all of these languages except Polish.
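The per-word formulation in this abstract can be made concrete with a small data-preparation sketch. It is not the authors' code and contains no neural model. It only shows how one MWE might be split into one example per word, each pairing the word's characters and the full MWE word sequence with the lemma part to generate. The names `WordExample` and `make_examples` are hypothetical, and the one-to-one whitespace alignment between the words of the form and the words of the lemma is a simplifying assumption.

```python
from typing import List, NamedTuple


class WordExample(NamedTuple):
    """One per-word example (hypothetical structure, for illustration only)."""
    chars: List[str]    # character sequence of the word (morphological signal)
    context: List[str]  # word sequence of the whole MWE (lexical/syntactic signal)
    position: int       # index of the word inside the MWE
    target: List[str]   # characters of the lemma part to generate for this word


def make_examples(mwe_form: str, mwe_lemma: str) -> List[WordExample]:
    """Split an MWE and its lemma into one example per word.

    Assumption: form and lemma have the same number of
    whitespace-separated words, aligned one-to-one.
    """
    words = mwe_form.split()
    lemma_parts = mwe_lemma.split()
    if len(words) != len(lemma_parts):
        raise ValueError("form and lemma must have the same number of words")
    return [
        WordExample(list(word), words, i, list(part))
        for i, (word, part) in enumerate(zip(words, lemma_parts))
    ]


# French example: the MWE lemma keeps the feminine agreement ("bleue"),
# so it is not simply the concatenation of the individual word lemmas.
examples = make_examples("cartes bleues", "carte bleue")
for ex in examples:
    print(ex.position, "".join(ex.chars), "->", "".join(ex.target))
```

In the example, the second word must be mapped to "bleue" rather than to its standalone lemma "bleu", which illustrates why the rest of the MWE is a useful input when generating each lemma part.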