Cross-Domain Detection of GPT-2-Generated Technical Text

Juan Rodriguez, Todd Hay, David Gros, Zain Shamsi, Ravi Srinivasan


Abstract
Machine-generated text presents a potential threat not only to the public sphere, but also to the scientific enterprise, whereby genuine research is undermined by convincing, synthetic text. In this paper we examine the problem of detecting GPT-2-generated technical research text. We first consider the realistic scenario where the defender does not have full information about the adversary’s text generation pipeline, but is able to label small amounts of in-domain genuine and synthetic text in order to adapt to the target distribution. Even in the extreme scenario of adapting a physics-domain detector to a biomedical detector, we find that only a few hundred labels are sufficient for good performance. Finally, we show that paragraph-level detectors can be used to detect the tampering of full-length documents under a variety of threat models.
Anthology ID:
2022.naacl-main.88
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1213–1233
URL:
https://aclanthology.org/2022.naacl-main.88
DOI:
10.18653/v1/2022.naacl-main.88
Cite (ACL):
Juan Rodriguez, Todd Hay, David Gros, Zain Shamsi, and Ravi Srinivasan. 2022. Cross-Domain Detection of GPT-2-Generated Technical Text. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1213–1233, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Cross-Domain Detection of GPT-2-Generated Technical Text (Rodriguez et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.88.pdf
Code
 ciads-ut/cross-domain-detection-gpt-2
Data
S2ORC, Semantic Scholar, WebText