Yijian Lu
2024
MarkLLM: An Open-Source Toolkit for LLM Watermarking
Leyi Pan | Aiwei Liu | Zhiwei He | Zitian Gao | Xuandong Zhao | Yijian Lu | Binglin Zhou | Shuliang Liu | Xuming Hu | Lijie Wen | Irwin King | Philip S. Yu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Watermarking for Large Language Models (LLMs), which embeds imperceptible yet algorithmically detectable signals in model outputs to identify LLM-generated text, has become crucial in mitigating the potential misuse of LLMs. However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives make it challenging for researchers and the community to understand, implement, and evaluate the latest advancements. To address these issues, we introduce MarkLLM, an open-source toolkit for LLM watermarking. MarkLLM offers a unified and extensible framework for implementing LLM watermarking algorithms, while providing user-friendly interfaces to ensure ease of access. Furthermore, it enhances understanding by supporting automatic visualization of the underlying mechanisms of these algorithms. For evaluation, MarkLLM offers a comprehensive suite of 12 tools spanning three perspectives, along with two types of automated evaluation pipelines. Through MarkLLM, we aim to support researchers while improving the general public's comprehension of and involvement in LLM watermarking technology, fostering consensus and driving further advancements in research and application. Our code is available at https://github.com/THU-BPM/MarkLLM.
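As an illustration of the unified interface the abstract describes, here is a minimal usage sketch assuming the AutoWatermark entry point and TransformersConfig wrapper shown in the project's README; exact class names, config paths, and arguments may differ across toolkit versions.

```python
# Minimal sketch of generating and detecting watermarked text with MarkLLM.
# Assumes the AutoWatermark/TransformersConfig interface from the README;
# names and arguments may differ between toolkit versions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from watermark.auto_watermark import AutoWatermark
from utils.transformers_config import TransformersConfig

device = "cuda" if torch.cuda.is_available() else "cpu"

# Wrap any Hugging Face causal LM so the watermarking algorithms can
# hook into its generation loop.
transformers_config = TransformersConfig(
    model=AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b").to(device),
    tokenizer=AutoTokenizer.from_pretrained("facebook/opt-1.3b"),
    vocab_size=50272,
    device=device,
    max_new_tokens=200,
)

# Load one of the implemented algorithms (e.g., KGW) through the
# unified, extensible interface.
watermark = AutoWatermark.load(
    "KGW",
    algorithm_config="config/KGW.json",
    transformers_config=transformers_config,
)

prompt = "Good morning. Today I want to talk about"
watermarked_text = watermark.generate_watermarked_text(prompt)

# Detection reports whether the embedded signal is present, with a score.
result = watermark.detect_watermark(watermarked_text)
print(result)
```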
An Entropy-based Text Watermarking Detection Method
Yijian Lu | Aiwei Liu | Dianzhi Yu | Jingjing Li | Irwin King
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Text watermarking algorithms for large language models (LLMs) can effectively identify machine-generated texts by embedding and detecting hidden features in the text. Although current text watermarking algorithms perform well in most high-entropy scenarios, their performance in low-entropy scenarios still needs improvement. In this work, we opine that the influence of token entropy should be fully considered in the watermark detection process, i.e., the weight of each token during watermark detection should be customized according to its entropy, rather than setting the weights of all tokens to the same value as in previous methods. Specifically, we propose Entropy-based Text Watermarking Detection (EWD), which gives higher-entropy tokens higher influence weights during watermark detection, so as to better reflect the degree of watermarking. Furthermore, the proposed detection process is training-free and fully automated. Experiments demonstrate that EWD achieves better detection performance in low-entropy scenarios, and that our method is general and can be applied to texts with different entropy distributions. Our code and data are available. Additionally, our algorithm can be accessed through MarkLLM (Pan et al., 2024).
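To make the entropy weighting concrete, the following is a minimal illustration (not the authors' released implementation) of an entropy-weighted z-score under a KGW-style green/red vocabulary split; the `is_green` mask and the min-shifted weight function are assumptions for illustration.

```python
# Sketch of entropy-weighted watermark detection (EWD-style).
# Assumes a KGW-style watermark that partitions the vocabulary into
# green/red lists; `is_green` is taken as given from that partition.
import torch
import torch.nn.functional as F

def token_entropies(model, input_ids):
    """Shannon entropy of the model's next-token distribution at each position."""
    with torch.no_grad():
        logits = model(input_ids).logits          # (1, seq_len, vocab)
    probs = F.softmax(logits, dim=-1)
    ent = -(probs * torch.log(probs.clamp_min(1e-9))).sum(-1)  # (1, seq_len)
    # The entropy used to score token t comes from its preceding context,
    # so drop the last position to align with tokens 1..seq_len-1.
    return ent[0, :-1]

def ewd_z_score(entropies, is_green, gamma=0.5):
    """Weighted z-score: higher-entropy tokens carry more detection weight."""
    # Shift so every weight is non-negative (one possible weight function).
    w = entropies - entropies.min()
    weighted_green = (w * is_green.float()).sum()
    mean = gamma * w.sum()                        # expected weighted green mass
    var = gamma * (1 - gamma) * (w ** 2).sum()    # variance of the weighted sum
    z = (weighted_green - mean) / torch.sqrt(var + 1e-9)
    return z.item()
```

The normalization follows from the null hypothesis: for unwatermarked text each token lands in the green list independently with probability gamma, so the weighted green count has mean gamma * sum(w) and variance gamma * (1 - gamma) * sum(w ** 2), and low-entropy tokens with near-zero weight contribute little noise to the statistic.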