Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood

Yang Xu, Yu Wang, Hao An, Zhichen Liu, Yongyuan Li


Abstract
Human-written and model-generated texts can be distinguished by examining the magnitude of likelihood in language. However, this is becoming increasingly difficult as language models' ability to generate human-like text keeps evolving. This study provides a new perspective by using relative likelihood values instead of absolute ones and by extracting useful features from the spectrum view of likelihood for the human-model text detection task. We propose a detection procedure with two classification methods, one supervised and one heuristic-based, which achieves performance competitive with previous zero-shot detection methods and a new state of the art on short-text detection. Our method can also reveal subtle differences between human and model languages, which have theoretical roots in psycholinguistics studies.
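The abstract's two ideas, relative rather than absolute likelihood and a spectrum view of likelihood, can be illustrated with a minimal sketch. It assumes that "relative" means z-score normalization of per-token log-likelihoods and that the "spectrum" is the magnitude of a discrete Fourier transform over token positions; both are assumptions made for illustration, and the function names and sample values below are hypothetical rather than taken from the paper's released software.

```python
import cmath
import math
from statistics import mean, pstdev


def relative_likelihood(log_probs):
    """Z-score normalize per-token log-likelihoods (assumed meaning of 'relative')."""
    mu = mean(log_probs)
    sigma = pstdev(log_probs)
    if sigma == 0:
        # A constant sequence carries no relative information.
        return [0.0 for _ in log_probs]
    return [(x - mu) / sigma for x in log_probs]


def magnitude_spectrum(signal):
    """Magnitudes of the discrete Fourier transform for frequencies 0..N//2."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff))
    return spectrum


# Hypothetical per-token log-likelihoods; in practice these would come from
# scoring a text with a language model.
log_probs = [-2.1, -0.4, -3.7, -1.2, -0.9, -4.5, -0.3, -2.8]
features = magnitude_spectrum(relative_likelihood(log_probs))
```

Because the normalized sequence has zero mean, the zero-frequency component vanishes and the remaining magnitudes describe how likelihood fluctuates across the text, independent of its overall level. Such a feature vector could then be passed to a supervised classifier or a heuristic rule; the paper's actual pipeline may differ from this sketch.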
Anthology ID:
2024.emnlp-main.564
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10108–10121
URL:
https://aclanthology.org/2024.emnlp-main.564
DOI:
10.18653/v1/2024.emnlp-main.564
Cite (ACL):
Yang Xu, Yu Wang, Hao An, Zhichen Liu, and Yongyuan Li. 2024. Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 10108–10121, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood (Xu et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.564.pdf
Software:
 2024.emnlp-main.564.software.zip
Data:
 2024.emnlp-main.564.data.zip