Pierre Magistry

2025

A corpus-driven description of OV order in Archaic Chinese
Qishen Wu | Santiago Herrera | Pierre Magistry | Sylvain Kahane
Proceedings of the Eighth International Conference on Dependency Linguistics (Depling, SyntaxFest 2025)

This paper presents a quantitative study of Object-Verb (OV) order in Archaic Chinese based on a Universal Dependencies (UD) treebank. Treating word order as a binary choice (OV vs VO), we train a sparse logistic-regression classifier that selects the most salient syntactic features needed for accurate prediction, in order to investigate the specific syntactic contexts that allow OV word order and to identify to what extent these factors favour it. The ranked features are understood as interpretable rules, and their coverage and precision as quantitative properties of each rule. The approach confirms earlier qualitative findings (e.g. pronoun object fronting and negation favour OV) and uncovers new contrasts in word order between different reflexive pronouns. It also identifies annotation errors that we corrected in the final analysis, illustrating how quantitative models, combined with fine-grained corpus analysis, can improve treebank quality. Our study demonstrates that lightweight machine-learning techniques applied to an existing syntactic resource can reveal fine-grained patterns in historical word order, and that the approach can be reapplied to other languages.

La science participative et l’ANR DiLSi
Pierre Magistry | Ilaine Wang
Actes de l'atelier Science Participative pour les Données et Corpus Linguistiques 2025 (ParCol)

This paper reports on lessons learned from the interactions between the DiLSi project and the communities of speakers of diaspora Teochew and of Tâigí.

TinyMentalLLMs Enable Depression Detection in Chinese Social Media Texts
Jinyuan Xu | Tian Lan | Mathieu Valette | Pierre Magistry | Lei Li
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Depression remains a major global mental health concern, bringing a higher risk of suicide and growing social costs tied to mental disorders. Leveraging social media as a valuable source of emotional signals, we identify two limitations in current NLP-based depression detection frameworks: (1) prediction systems often lack clear, user-friendly explanations for their predictions, and (2) the computational and confidentiality demands of LLMs are misaligned with the need for dependable, privacy-focused small-scale deployments. To address these challenges, we introduce TinyMentalLLMs (TMLs), a compact framework that offers two key contributions: (a) the construction of a small yet representative dataset through psychology-based textometry, and (b) an efficient fine-tuning strategy centered on multiple aspects of depression. This design improves both accuracy and F1 scores in generative models with 0.5B and 1.5B parameters, consistently yielding over 20% performance gains across datasets. TMLs achieve results on par with, and deliver better text quality than, much larger state-of-the-art models.
