AudioStealer: Extracting Audio Prompts via Shapley Value-Guided Query Search

Yingbin Jin; Xingjian Du; Hanjun Luo; Zihao Wang; Haibo Hu; XiaoFeng Wang; Xinfeng Li

Abstract

As text-to-music models gain widespread adoption, the prompts used to guide these systems have become valuable intellectual property. This shift has given rise to a new form of attack: prompt stealing, which aims to reconstruct the high-value prompts that guide music generation. However, unlike prior work in text and image generation, prompt stealing in text-to-music systems faces unique challenges due to the entangled and diffuse nature of semantic representations in audio, which complicates the decoupling of specific textual tokens from acoustic outputs. To address these challenges, we present AudioStealer, the first targeted study of prompt inversion in the audio domain. AudioStealer operates via a two-stage black-box attack framework: first, a heuristic search guided by audio-language embeddings identifies initial candidates; then, these candidates are refined using a game-theoretic strategy based on Shapley value estimation to attribute precise semantic contributions. Our method requires no direct access to the target model and relies solely on a shadow model, making it broadly applicable. Through extensive experiments, we demonstrate that AudioStealer recovers prompts with high textual consistency to the ground truth, while the regenerated audio maintains strong perceptual similarity to the target recordings. These results expose critical vulnerabilities in the text-to-audio market ecosystem and underscore the urgent need for intellectual property protections in generative audio technologies.

Anthology ID: 2026.findings-acl.688
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 14052–14067
URL: https://aclanthology.org/2026.findings-acl.688/
Cite (ACL): Yingbin Jin, Xingjian Du, Hanjun Luo, Zihao Wang, Haibo Hu, XiaoFeng Wang, and Xinfeng Li. 2026. AudioStealer: Extracting Audio Prompts via Shapley Value-Guided Query Search. In Findings of the Association for Computational Linguistics: ACL 2026, pages 14052–14067, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): AudioStealer: Extracting Audio Prompts via Shapley Value-Guided Query Search (Jin et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.688.pdf
Checklist: 2026.findings-acl.688.checklist.pdf
