SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation

Abe Hou; Jingyu Zhang; Tianxing He; Yichen Wang; Yung-Sung Chuang; Hongwei Wang; Lingfeng Shen; Benjamin Van Durme; Daniel Khashabi; Yulia Tsvetkov

doi:10.18653/v1/2024.naacl-long.226

Abstract

Existing watermarked generation algorithms employ token-level designs and are therefore vulnerable to paraphrase attacks. To address this issue, we introduce watermarking on the semantic representation of sentences. We propose SemStamp, a robust sentence-level semantic watermarking algorithm that uses locality-sensitive hashing (LSH) to partition the semantic space of sentences. The algorithm encodes and LSH-hashes a candidate sentence generated by a language model, and conducts rejection sampling until the sampled sentence falls in a watermarked partition of the semantic embedding space. To test the paraphrastic robustness of watermarking algorithms, we propose a “bigram paraphrase” attack that produces paraphrases with small bigram overlap with the original sentence. This attack is shown to be effective against existing token-level watermark algorithms, while causing only minor degradation to SemStamp. Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method across various paraphrasers and domains, but also better at preserving the quality of generation.
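The rejection-sampling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a random-hyperplane (sign-based) LSH, uses random vectors as stand-ins for sentence embeddings, and takes the seed that selects the watermarked partitions as a plain parameter. All function names (`make_hyperplanes`, `lsh_signature`, `valid_partitions`, `sample_watermarked`) and all parameter values are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random normal vectors; each one defines a hyperplane."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vec, hyperplanes):
    """Sign-based LSH (an assumption): one bit per hyperplane, packed into an int."""
    sig = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        sig = (sig << 1) | (1 if dot > 0 else 0)
    return sig


def valid_partitions(num_bits, ratio, seed):
    """Pseudorandomly mark a fraction `ratio` of the 2**num_bits regions as watermarked."""
    rng = random.Random(seed)
    regions = list(range(2 ** num_bits))
    rng.shuffle(regions)
    return set(regions[: max(1, int(ratio * len(regions)))])


def sample_watermarked(propose, hyperplanes, valid, max_tries=100):
    """Rejection sampling: keep proposing candidates until one lands in a valid region.

    `propose` is a hypothetical callable standing in for "generate a sentence
    with the language model and embed it"; here it only returns a vector.
    """
    for attempt in range(1, max_tries + 1):
        vec = propose()
        if lsh_signature(vec, hyperplanes) in valid:
            return vec, attempt
    return None, max_tries  # give up after max_tries rejected candidates


# Toy usage with random vectors in place of real sentence embeddings.
DIM, BITS = 8, 3
planes = make_hyperplanes(BITS, DIM, seed=1)
valid = valid_partitions(BITS, ratio=0.25, seed=42)
_rng = random.Random(7)


def toy_propose():
    return [_rng.gauss(0.0, 1.0) for _ in range(DIM)]


vec, tries = sample_watermarked(toy_propose, planes, valid)
```

In this sketch a detector would recompute `lsh_signature` for each sentence and count how many fall in the watermarked regions. Because the signature depends only on the signs of the projections, it is unchanged when an embedding is rescaled, and an embedding that moves only slightly usually stays in the same region. That property is the intuition behind the robustness to paraphrase claimed above.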

Anthology ID: 2024.naacl-long.226
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 4067–4082
URL: https://aclanthology.org/2024.naacl-long.226
DOI: 10.18653/v1/2024.naacl-long.226
Cite (ACL): Abe Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, and Yulia Tsvetkov. 2024. SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4067–4082, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation (Hou et al., NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-long.226.pdf
