Profiler: Black-box AI-generated Text Origin Detection via Context-aware Inference Pattern Analysis

Hanxi Guo; Siyuan Cheng; Xiaolong Jin; Zhuo Zhang; Guangyu Shen; Kaiyuan Zhang; Shengwei An; Guanhong Tao; Xiangyu Zhang

doi:10.18653/v1/2025.emnlp-main.1265

Abstract

With the increasing capabilities of Large Language Models (LLMs), the proliferation of AI-generated texts has become a serious concern. Given the diverse range of organizations providing LLMs, it is crucial for governments and third-party entities to identify the origin LLM of a given AI-generated text to enable accurate mitigation of potential misuse and infringement. However, existing detection methods, primarily designed to distinguish between human-generated and LLM-generated texts, often fail to accurately identify the origin LLM due to the high similarity of AI-generated texts from different LLMs. In this paper, we propose a novel black-box AI-generated text origin detection method, dubbed Profiler, which accurately predicts the origin of an input text by extracting distinct context inference patterns through calculating and analyzing novel context losses between the surrogate model’s output logits and the adjacent input context. Extensive experimental results show that Profiler outperforms 10 state-of-the-art baselines, achieving more than a 25% increase in AUC score on average across both natural language and code datasets when evaluated against five of the latest commercial LLMs under both in-distribution and out-of-distribution settings.
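The abstract describes context losses computed between a surrogate model's output logits and the adjacent input context. The sketch below is one possible reading of that idea, not the authors' implementation: for each position it takes the cross-entropy between the logits at that position and the input tokens at nearby offsets. The function names (`log_softmax`, `context_losses`), the `window` parameter, and the symmetric choice of offsets are all assumptions made for illustration; the paper defines the actual losses and the features derived from them.

```python
import math

def log_softmax(row):
    # Numerically stable log-softmax over one vocabulary-sized row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def context_losses(logits, token_ids, window=2):
    """Hypothetical helper: cross-entropy between the logits at position i
    and the input token at position i + k, for every offset k in
    [-window, window]. Returns {k: [loss, ...]} over all valid positions."""
    n = len(token_ids)
    log_probs = [log_softmax(row) for row in logits]
    losses = {k: [] for k in range(-window, window + 1)}
    for i in range(n):
        for k in range(-window, window + 1):
            j = i + k
            if 0 <= j < n:  # skip offsets that fall outside the text
                losses[k].append(-log_probs[i][token_ids[j]])
    return losses

# Toy input: 2 positions, vocabulary of size 2 (illustrative values only).
toy_logits = [[10.0, 0.0], [0.0, 10.0]]
toy_tokens = [0, 1]
toy = context_losses(toy_logits, toy_tokens, window=1)
```

In a pipeline along these lines, the per-offset loss sequences would presumably be summarized into features and passed to a classifier that predicts the origin LLM; that step is omitted here because the abstract does not specify it.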

Anthology ID: 2025.emnlp-main.1265
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 24892–24912
URL: https://aclanthology.org/2025.emnlp-main.1265/
DOI: 10.18653/v1/2025.emnlp-main.1265
Cite (ACL): Hanxi Guo, Siyuan Cheng, Xiaolong Jin, Zhuo Zhang, Guangyu Shen, Kaiyuan Zhang, Shengwei An, Guanhong Tao, and Xiangyu Zhang. 2025. Profiler: Black-box AI-generated Text Origin Detection via Context-aware Inference Pattern Analysis. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 24892–24912, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Profiler: Black-box AI-generated Text Origin Detection via Context-aware Inference Pattern Analysis (Guo et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1265.pdf
Checklist: 2025.emnlp-main.1265.checklist.pdf
