Synthetic Text Detection in the Age of Large Language Models: Watermark vs. Automatic Detection

Adaku Uchendu

Abstract

Given the ubiquity of Large Language Models (LLMs) and their impressive capabilities, malicious uses of this technology to generate harmful content have been observed. To mitigate this serious security risk, researchers have proposed two techniques for detecting synthetic texts generated by LLMs: watermarking and automatic detection. Watermarking infuses generated content with algorithmically identifiable patterns during generation, which makes accurate synthetic text detection achievable with a watermark detector. Automatic detection, by contrast, uses statistical and linguistic cues to attribute a text's authorship to a human or an LLM. Currently, both types of synthetic text detectors achieve state-of-the-art performance; however, which detector is better is still unknown. To ascertain the better detection method, we evaluate each method on both unperturbed and perturbed (i.e., adversarially manipulated) texts. We perform a comprehensive study across six different sizes of Qwen2.5 models, six watermark techniques and detectors, two automatic detectors, three authorship obfuscation methods for different levels of syntactic changes, and two datasets of different text lengths. Our results suggest that no detector consistently outperforms the others in all scenarios. However, we observe that (1) automatic detectors are better for short synthetic text detection; and (2) watermark detectors perform better at defending against the word-level attack implemented.

Anthology ID: 2026.acl-industry.9
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 113–126
URL: https://aclanthology.org/2026.acl-industry.9/
Cite (ACL): Adaku Uchendu. 2026. Synthetic Text Detection in the Age of Large Language Models: Watermark vs. Automatic Detection. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 113–126, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Synthetic Text Detection in the Age of Large Language Models: Watermark vs. Automatic Detection (Uchendu, ACL 2026)
PDF: https://aclanthology.org/2026.acl-industry.9.pdf
