The Challenge of Identifying the Origin of Black-Box Large Language Models

Ziqing Yang; Yixin Wu; Yun Shen; Wei Dai; Michael Backes; Yang Zhang

doi:10.18653/v1/2026.privatenlp-main.2

Abstract

The tremendous commercial potential of large language models (LLMs) has heightened concerns over their unauthorized use. To address this, we focus on the task of identifying the origin of black-box LLMs. We further propose PlugAE, an effective and efficient identification method that proactively leverages LLM-specific adversarial embeddings and allows users to customize copyright tokens on a targeted query set. Extensive experiments demonstrate that PlugAE outperforms both state-of-the-art model watermarking and fingerprinting methods in accuracy and robustness. We further analyze its stealthiness and reliability from three complementary perspectives and conduct ablation studies under various configurations, confirming its practicality for real-world misuse detection.

Anthology ID: 2026.privatenlp-main.2
Volume: Proceedings of the Seventh Workshop on Privacy in Natural Language Processing
Month: July
Year: 2026
Address: San Diego, California
Editors: Ivan Habernal, Sepideh Ghanavati, Sara Haghighi, Krithika Ramesh, Timour Igamberdiev, Shomir Wilson
Venues: PrivateNLP | WS
Publisher: Association for Computational Linguistics
Pages: 7–25
URL: https://aclanthology.org/2026.privatenlp-main.2/
DOI: 10.18653/v1/2026.privatenlp-main.2
Cite (ACL): Ziqing Yang, Yixin Wu, Yun Shen, Wei Dai, Michael Backes, and Yang Zhang. 2026. The Challenge of Identifying the Origin of Black-Box Large Language Models. In Proceedings of the Seventh Workshop on Privacy in Natural Language Processing, pages 7–25, San Diego, California. Association for Computational Linguistics.
Cite (Informal): The Challenge of Identifying the Origin of Black-Box Large Language Models (Yang et al., PrivateNLP 2026)
PDF: https://aclanthology.org/2026.privatenlp-main.2.pdf
