Marking Code Without Breaking It: Code Watermarking for Detecting LLM-Generated Code

Jungin Kim; Shinwoo Park; Yo-Sub Han

Abstract

Identifying LLM-generated code through watermarking poses a challenge in preserving functional correctness. Previous methods rely on the assumption that watermarking high-entropy tokens effectively maintains output quality. Our analysis reveals a fundamental limitation of this assumption: syntax-critical tokens such as keywords often exhibit the highest entropy, making existing approaches vulnerable to logic corruption. We present STONE, a syntax-aware watermarking method that embeds watermarks only in non-syntactic tokens and preserves code integrity. For rigorous evaluation, we also introduce STEM, a comprehensive metric that balances three critical dimensions: correctness, detectability, and imperceptibility. Across Python, C++, and Java, STONE preserves correctness, sustains strong detectability, and achieves balanced performance with minimal computational overhead. Our implementation is available at https://github.com/inistory/STONE-watermarking.
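The core idea in the abstract, restricting the watermark to non-syntactic tokens, can be illustrated with a toy sketch. This is not the STONE implementation. It assumes a generic keyed green-list scheme, and every name here (`GAMMA`, `DELTA`, `is_green`, `bias_logits`, `z_score`) is hypothetical. So is the choice to treat identifiers, numbers, and string literals as the non-syntactic tokens. Keywords, operators, and delimiters are never biased during generation and never counted during detection.

```python
import hashlib
import io
import keyword
import math
import tokenize

GAMMA = 0.5  # assumed fraction of candidate tokens treated as "green"
DELTA = 2.0  # assumed logit bonus given to green candidates

# Assumption: only identifiers, numbers, and string literals may carry the mark.
ELIGIBLE_TYPES = {tokenize.NAME, tokenize.NUMBER, tokenize.STRING}


def is_non_syntactic(tok_type, text):
    """Return True for tokens that may carry the watermark."""
    if tok_type == tokenize.NAME and keyword.iskeyword(text):
        return False  # keywords such as `if` or `return` stay untouched
    return tok_type in ELIGIBLE_TYPES


def candidate_is_non_syntactic(text):
    """Classify a single candidate token string by its surface form."""
    if keyword.iskeyword(text):
        return False
    return text.isidentifier() or text[:1].isdigit() or text.startswith(('"', "'"))


def is_green(prev, cur, key):
    """Keyed pseudo-random green/red decision seeded by the previous token."""
    digest = hashlib.sha256(f"{key}|{prev}|{cur}".encode()).digest()
    return digest[0] / 256 < GAMMA


def bias_logits(logits, prev, key):
    """Add DELTA only to green candidates that are non-syntactic."""
    return {
        text: score + DELTA
        if candidate_is_non_syntactic(text) and is_green(prev, text, key)
        else score
        for text, score in logits.items()
    }


def z_score(code, key):
    """Detection statistic computed over non-syntactic tokens only."""
    hits = total = 0
    prev = ""
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if is_non_syntactic(tok.type, tok.string):
            total += 1
            hits += is_green(prev, tok.string, key)
        prev = tok.string
    if total == 0:
        return 0.0
    return (hits - GAMMA * total) / math.sqrt(total * GAMMA * (1 - GAMMA))
```

In this sketch, a candidate such as `if` or `(` keeps its original logit, so the bias cannot push generation toward a different keyword or delimiter. A detector would compare `z_score(code, key)` against a threshold chosen for the desired false-positive rate.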

Anthology ID: 2026.findings-eacl.207
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3990–4002
URL: https://aclanthology.org/2026.findings-eacl.207/
Cite (ACL): Jungin Kim, Shinwoo Park, and Yo-Sub Han. 2026. Marking Code Without Breaking It: Code Watermarking for Detecting LLM-Generated Code. In Findings of the Association for Computational Linguistics: EACL 2026, pages 3990–4002, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Marking Code Without Breaking It: Code Watermarking for Detecting LLM-Generated Code (Kim et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-eacl.207.pdf
Checklist: 2026.findings-eacl.207.checklist.pdf
