On the Generalization of Training-based ChatGPT Detection Methods

Han Xu, Jie Ren, Pengfei He, Shenglai Zeng, Yingqian Cui, Amy Liu, Hui Liu, Jiliang Tang


Abstract
Large language models, such as ChatGPT, achieve amazing performance on various language processing tasks. However, they can also be exploited for improper purposes such as plagiarism or misinformation dissemination. Thus, there is an urgent need to detect the texts generated by LLMs. One type of most studied methods trains classification models to distinguish LLM texts from human texts. However, existing studies demonstrate the trained models may suffer from distribution shifts (during test), i.e., they are ineffective to predict the generated texts from unseen language tasks or topics which are not collected during training. In this work, we focus on ChatGPT as a representative model, and we conduct a comprehensive investigation on these methods’ generalization behaviors under distribution shift caused by a wide range of factors, including prompts, text lengths, topics, and language tasks. To achieve this goal, we first collect a new dataset with human and ChatGPT texts, and then we conduct extensive studies on the collected dataset. Our studies unveil insightful findings that provide guidance for future methodologies and data collection strategies for LLM detection.
Anthology ID:
2024.findings-emnlp.424
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
7223–7243
Language:
URL:
https://aclanthology.org/2024.findings-emnlp.424
DOI:
Bibkey:
Cite (ACL):
Han Xu, Jie Ren, Pengfei He, Shenglai Zeng, Yingqian Cui, Amy Liu, Hui Liu, and Jiliang Tang. 2024. On the Generalization of Training-based ChatGPT Detection Methods. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 7223–7243, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
On the Generalization of Training-based ChatGPT Detection Methods (Xu et al., Findings 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.findings-emnlp.424.pdf
Data:
 2024.findings-emnlp.424.data.zip