RShield: A User-level Traceable Backdoor Watermark for LLMs in Embedding-as-a-Service

Lingyun Xiang, Yufan Zhong, Chengfu Ou, Zhihua Xia, Chunfang Yang, Daojian Zeng, Zhangjie Fu


Abstract
Embedding-as-a-Service (EaaS) has emerged as a critical paradigm for commercializing large language models (LLMs). However, existing backdoor watermarking techniques are fundamentally limited to "zero-bit" detection, which prevents user-level traceability in multi-user EaaS scenarios. To address this limitation, we propose RShield, a multi-bit backdoor watermarking scheme that enables reliable user-level attribution of LLMs for EaaS under model extraction attacks. RShield integrates Reed-Solomon error-correcting codes with orthogonal feature mapping to introduce highly structured redundancy, constructing fault-tolerant symbol sequences for the multi-bit watermark space so that the watermark remains recoverable even under aggressive extraction noise. To mitigate semantic distortion caused by noisy-channel interference, RShield employs a lightweight Adapter to adaptively inject multi-bit watermarks in the feature space, preserving the quality of EaaS while achieving user-level traceability. Extensive experiments on four NLP benchmarks demonstrate that, compared with existing methods, RShield efficiently achieves 100% multi-bit watermark recovery and high semantic fidelity under model extraction attacks, while significantly reducing the degradation that watermarking causes in downstream task performance.
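As a rough illustration of why Reed-Solomon redundancy lets a multi-bit identifier survive symbol loss, the sketch below encodes a short symbol sequence systematically over a prime field and recovers it after half of the symbols are erased. It is not the paper's construction: the field size `P`, the function names, the code length, and the example user-ID symbols are all assumptions made for illustration, and the sketch handles erasures only (not symbol errors) and does not model the orthogonal feature mapping or the Adapter.

```python
# Minimal Reed-Solomon erasure-coding sketch (illustrative assumption, not RShield itself).
# Requires Python 3.8+ for pow(x, -1, P).

P = 257  # prime field size; hypothetical choice for this example


def lagrange_eval(points, x0):
    """Evaluate at x0 the unique polynomial through `points` over GF(P)."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x0 - xm) % P
                den = den * (xj - xm) % P
        # Multiply by the modular inverse of the denominator.
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def rs_encode(message, n):
    """Systematic encoding: the k message symbols are the polynomial's values
    at x = 1..k, and the codeword is its values at x = 1..n."""
    points = [(i + 1, s) for i, s in enumerate(message)]
    return [lagrange_eval(points, x) for x in range(1, n + 1)]


def rs_decode(received, k):
    """Recover the k message symbols from any k surviving codeword symbols.
    Erased symbols are marked with None."""
    survivors = [(i + 1, s) for i, s in enumerate(received) if s is not None]
    if len(survivors) < k:
        raise ValueError("too many erasures to recover the message")
    points = survivors[:k]
    return [lagrange_eval(points, x) for x in range(1, k + 1)]


# Hypothetical 4-symbol user identifier, expanded to 8 symbols.
user_id = [3, 141, 59, 26]
codeword = rs_encode(user_id, 8)

# Simulate losing four of the eight symbols.
received = list(codeword)
for lost in (0, 2, 5, 7):
    received[lost] = None

recovered = rs_decode(received, len(user_id))
```

With a code of length n carrying k message symbols, any k surviving symbols determine the degree-(k-1) polynomial, so up to n - k erasures can be tolerated in this setting; correcting corrupted (rather than missing) symbols would need a full error-locating decoder.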
Anthology ID:
2026.findings-acl.1347
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
27014–27028
URL:
https://aclanthology.org/2026.findings-acl.1347/
Cite (ACL):
Lingyun Xiang, Yufan Zhong, Chengfu Ou, Zhihua Xia, Chunfang Yang, Daojian Zeng, and Zhangjie Fu. 2026. RShield: A User-level Traceable Backdoor Watermark for LLMs in Embedding-as-a-Service. In Findings of the Association for Computational Linguistics: ACL 2026, pages 27014–27028, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
RShield: A User-level Traceable Backdoor Watermark for LLMs in Embedding-as-a-Service (Xiang et al., Findings 2026)
PDF:
https://aclanthology.org/2026.findings-acl.1347.pdf
Checklist:
 2026.findings-acl.1347.checklist.pdf