Matthieu Dubois
2025
MOSAIC: Multiple Observers Spotting AI Content
Matthieu Dubois | François Yvon | Pablo Piantanida
Findings of the Association for Computational Linguistics: ACL 2025
The dissemination of Large Language Models (LLMs), trained at scale and endowed with powerful text-generating abilities, has made it easier for anyone to produce harmful, toxic, faked or forged content. In response, various proposals have been made to automatically discriminate artificially generated from human-written texts, typically framing the problem as a binary classification task. Early approaches evaluate an input document with a well-chosen detector LLM, assuming that low perplexity scores reliably signal machine-made content. More recent systems instead consider two LLMs and compare their probability distributions over the document to further discriminate when perplexity alone cannot. However, relying on a fixed pair of models can make performance brittle. We extend these approaches to the ensembling of several LLMs and derive a new, theoretically grounded approach to combine their respective strengths. Our experiments, using a variety of generator LLMs, suggest that this approach effectively harnesses each model’s capabilities, leading to strong detection performance on a variety of domains.
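The two detection regimes described above can be made concrete with a small sketch. This is our own illustration, not code from the paper: it computes perplexity from per-token log-probabilities, and a simple two-model contrast score (in the spirit of ratio-based detectors; the exact formulation varies by method). In practice the log-probabilities would come from scoring the document with actual LLMs.

```python
import math

def perplexity(logprobs):
    """Perplexity of a sequence given per-token log-probabilities:
    exp of the mean negative log-likelihood."""
    return math.exp(-sum(logprobs) / len(logprobs))

def contrastive_score(logprobs_a, logprobs_b):
    """Two-model comparison: ratio of the log-perplexities that two
    detector models assign to the same document. A simplified stand-in
    for the cross-model scores used in recent detectors."""
    return math.log(perplexity(logprobs_a)) / math.log(perplexity(logprobs_b))

# Synthetic example: model A is twice as "surprised" per token as model B.
lp_a = [-math.log(4.0)] * 8   # each token has probability 1/4 under A
lp_b = [-math.log(2.0)] * 8   # each token has probability 1/2 under B
score = contrastive_score(lp_a, lp_b)
```

A single-model detector would threshold `perplexity` directly; the two-model variant thresholds the contrast score instead, which is what a fixed model pair makes brittle when either model is poorly matched to the generator.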
MOSAIC at GENAI Detection Task 3 : Zero-Shot Detection Using an Ensemble of Models
Matthieu Dubois | François Yvon | Pablo Piantanida
Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
MOSAIC introduces a new ensemble approach that combines several detector models to spot AI-generated texts. The method enhances the reliability of detection by integrating insights from multiple models, thus addressing the limitations of a single detector model, which often yields brittle performance. The approach relies on a theoretically grounded algorithm that minimizes the worst-case expected encoding size across models, thereby optimizing the detection process. In this submission, we report evaluation results on the RAID benchmark, a comprehensive English-centric testbed for machine-generated texts. These results were obtained in the context of the “Cross-domain Machine-Generated Text Detection” shared task. We show that our model can be competitive for a variety of domains and generator models, but that it can be challenged by adversarial attacks and by changes in the text generation strategy.
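The encoding-size view of ensembling can be sketched as follows. This is our own minimal illustration, not the paper's implementation: it computes the code length (in bits) of a token sequence under a weighted mixture of models. The weights here are uniform placeholders; the paper instead derives weights that minimize the worst-case expected code length.

```python
import math

def mixture_code_length(token_probs, weights=None):
    """Code length, in bits, of a sequence under a weighted mixture of models.

    token_probs: one entry per token; each entry lists the probability that
    each model assigns to that token.
    weights: one non-negative weight per model, summing to 1. Uniform
    weights are used here for illustration only.
    """
    n_models = len(token_probs[0])
    if weights is None:
        weights = [1.0 / n_models] * n_models
    bits = 0.0
    for probs in token_probs:
        # Mixture probability of this token, then its ideal code length.
        mix = sum(w * p for w, p in zip(weights, probs))
        bits += -math.log2(mix)
    return bits

# Synthetic example: two models, four tokens, both assigning p = 1/2
# to every token, so each token costs exactly one bit under the mixture.
cost = mixture_code_length([[0.5, 0.5]] * 4)
```

A detector built on this quantity would compare the per-token code length of a document against a threshold: shorter codes (the mixture predicts the text well) point toward machine generation, mirroring the low-perplexity signal in the single-model case.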
MOSAIC : Mélange d’experts pour la détection de textes artificiels
Matthieu Dubois | Pablo Piantanida | François Yvon
Actes de la 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : articles scientifiques originaux
The public release of large language models has made it easier to produce harmful, slanderous, deceptive or forged content. In response, several solutions have been proposed to identify such machine-generated texts, treating the problem as a binary classification task. Early approaches rely on scoring a document with a detector model, under the assumption that a low perplexity score indicates artificial content. More recent methods instead compare the probability distributions computed by two models. However, relying on a fixed pair of models can make performance brittle. We extend these methods by combining several models and by developing a theoretically grounded approach to make the best use of each of them.