Towards a Robust Detection of Language Model-Generated Text: Is ChatGPT that easy to detect?

Wissam Antoun; Virginie Mouilleron; Benoît Sagot; Djamé Seddah

Abstract

Recent advances in natural language processing (NLP) have led to the development of large language models (LLMs) such as ChatGPT. This paper proposes a methodology for developing and evaluating ChatGPT detectors for French text, with a focus on investigating their robustness on out-of-domain data and against common attack schemes. The proposed method involves translating an English dataset into French and training a classifier on the translated data. Results show that the detectors can effectively detect ChatGPT-generated text, with a degree of robustness against basic attack techniques in in-domain settings. However, vulnerabilities are evident in out-of-domain contexts, highlighting the challenge of detecting adversarial text. The study emphasizes caution when applying in-domain testing results to a wider variety of content. We provide our translated datasets and models as open-source resources.

Anthology ID: 2023.jeptalnrecital-long.2
Volume: Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : travaux de recherche originaux -- articles longs
Month: June
Year: 2023
Address: Paris, France
Editors: Christophe Servan, Anne Vilnat
Venue: JEP/TALN/RECITAL
Publisher: ATALA
Pages: 14–27
URL: https://aclanthology.org/2023.jeptalnrecital-long.2/
Cite (ACL): Wissam Antoun, Virginie Mouilleron, Benoît Sagot, and Djamé Seddah. 2023. Towards a Robust Detection of Language Model-Generated Text: Is ChatGPT that easy to detect? In Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : travaux de recherche originaux -- articles longs, pages 14–27, Paris, France. ATALA.
Cite (Informal): Towards a Robust Detection of Language Model-Generated Text: Is ChatGPT that easy to detect? (Antoun et al., JEP/TALN/RECITAL 2023)
PDF: https://aclanthology.org/2023.jeptalnrecital-long.2.pdf
