Context-aware Watermark with Semantic Balanced Green-red Lists for Large Language Models

Yuxuan Guo, Zhiliang Tian, Yiping Song, Tianlun Liu, Liang Ding, Dongsheng Li


Abstract
Watermarking enables people to determine whether a text was generated by a specific model. It injects a unique signature based on a “green-red” list that can be tracked during detection, where words in the green list are encouraged during generation. Recent work proposes fixing the green/red lists or increasing the proportion of green tokens to defend against paraphrasing attacks. However, these methods degrade text quality due to semantic disparities between the watermarked and unwatermarked text. In this paper, we propose a semantic-aware watermarking method that uses the context to generate a semantic-aware key, which splits a semantically balanced green/red list for watermark injection. The semantically balanced list reduces the performance drop caused by biasing the green list. To defend against paraphrasing attacks, we generate the watermark key from the semantics of the context via locality-sensitive hashing. To improve text quality, we split the green/red lists by semantics so that the green list covers almost all semantics. We also dynamically adapt the bias to balance text quality and robustness. Experiments show that our method achieves advantages in both robustness and text quality compared to existing baselines.
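The green-red watermarking scheme the abstract builds on can be illustrated with a minimal sketch. This is not the paper's implementation; the vocabulary size, embedding dimension, bias delta, and green-list fraction gamma below are illustrative assumptions, and the context embedding stands in for whatever semantic representation the model provides. It shows the two moving parts the abstract names: a context-derived key computed with locality-sensitive hashing (sign-of-projection onto random hyperplanes), and a logit bias applied to the green list seeded by that key.

```python
import numpy as np

rng_setup = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's setup): a small vocabulary
# with random vectors in place of an LLM's token/context embeddings.
VOCAB_SIZE, EMB_DIM = 50, 16
embeddings = rng_setup.normal(size=(VOCAB_SIZE, EMB_DIM))

# Random hyperplanes for locality-sensitive hashing: contexts whose
# embeddings are similar fall on the same side of most planes, so they
# hash to the same bucket and hence the same watermark key.
N_PLANES = 8
planes = rng_setup.normal(size=(N_PLANES, EMB_DIM))

def lsh_key(context_embedding):
    """Map a context embedding to an integer key via sign-of-projection LSH."""
    bits = (planes @ context_embedding > 0).astype(int)
    return int("".join(map(str, bits)), 2)

def green_list(key, gamma=0.5):
    """Seed a PRNG with the semantic key and mark a fraction gamma of the
    vocabulary as 'green'."""
    rng = np.random.default_rng(key)
    perm = rng.permutation(VOCAB_SIZE)
    return set(perm[: int(gamma * VOCAB_SIZE)].tolist())

def watermark_logits(logits, context_embedding, delta=2.0):
    """Add a bias delta to the logits of green-list tokens, which makes the
    model favor green tokens and so embeds a detectable signature."""
    green = green_list(lsh_key(context_embedding))
    biased = logits.copy()
    for tok in green:
        biased[tok] += delta
    return biased
```

The point of deriving the key from an LSH bucket rather than from the exact previous token is robustness: a paraphrase that preserves the context's semantics lands in the same bucket, reproduces the same key, and so the detector recovers the same green list. The fixed bias `delta` here is a simplification; the paper adapts the bias dynamically to trade off text quality against robustness.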
Anthology ID:
2024.emnlp-main.1260
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
22633–22646
URL:
https://aclanthology.org/2024.emnlp-main.1260
Cite (ACL):
Yuxuan Guo, Zhiliang Tian, Yiping Song, Tianlun Liu, Liang Ding, and Dongsheng Li. 2024. Context-aware Watermark with Semantic Balanced Green-red Lists for Large Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 22633–22646, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Context-aware Watermark with Semantic Balanced Green-red Lists for Large Language Models (Guo et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.1260.pdf