Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs

Qiongkai Xu, Xuanli He, Lingjuan Lyu, Lizhen Qu, Gholamreza Haffari


Abstract
Machine-learning-as-a-service (MLaaS) has attracted millions of users to their splendid large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works have demonstrated that attackers manage to steal or extract the victim models. Nonetheless, none of the previous stolen models can outperform the original black-box APIs. In this work, we conduct unsupervised domain adaptation and multi-victim ensemble to show that attackers could potentially surpass victims, which is beyond previous understanding of model extraction. Extensive experiments on both benchmark datasets and real-world APIs validate that the imitators can succeed in outperforming the original black-box models on transferred domains. We consider our work as a milestone in the research of imitation attack, especially on NLP APIs, as the superior performance could influence the defense or even publishing strategy of API providers.
Anthology ID:
2022.coling-1.251
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
2849–2860
URL:
https://aclanthology.org/2022.coling-1.251
Cite (ACL):
Qiongkai Xu, Xuanli He, Lingjuan Lyu, Lizhen Qu, and Gholamreza Haffari. 2022. Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2849–2860, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs (Xu et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.251.pdf
Data:
IMDb Movie Reviews, SST, WMT 2014