On Log-Loss Scores and (No) Privacy

Abhinav Aggarwal, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier


Abstract
A common metric for assessing the performance of binary classifiers is the Log-Loss score, a real number that measures the cross-entropy between the predicted distribution over the labels and the true distribution (a point distribution defined by the ground truth labels). In this paper, we show that a malicious modeler who obtains the Log-Loss scores on its predictions can exploit this information to infer all the ground truth labels of arbitrary test datasets with full accuracy. We provide an efficient algorithm to perform this inference. A particularly interesting application of this attack is breaching privacy in the setting of Membership Inference Attacks. These attacks exploit the vulnerabilities that arise when models trained on customer data are exposed to queries made by an adversary. Privacy auditing tools for measuring leakage from sensitive datasets assess the total privacy leakage based on the adversary's predictions for datapoint membership. An instance of the proposed attack can hence cause a complete breach of membership privacy, without requiring the adversary to train an attack model or to have access to side knowledge. Moreover, our algorithm is agnostic to the model under attack and hence enables perfect membership inference even for models that do not memorize or overfit. In particular, our observations provide insight into the extent of information leakage from statistical aggregates and how they can be exploited.
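To illustrate why a single Log-Loss score can reveal every label, the following is a minimal sketch and not necessarily the algorithm given in the paper. It assumes the adversary chooses its own predictions and receives the exact average Log-Loss over N datapoints. The function names, the `eps` parameter, and the power-of-two encoding are assumptions made for this illustration. The idea is that if the log-odds of prediction i equal `eps * 2**i`, then N times the score, minus a term that depends only on the predictions, is a weighted sum of the labels whose binary expansion is the label vector.

```python
import math


def log_loss(labels, preds):
    """Average binary cross-entropy between labels (0/1) and predicted probabilities."""
    n = len(labels)
    total = sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(labels, preds)
    )
    return -total / n


def craft_predictions(n, eps=0.01):
    """Hypothetical adversarial predictions: the log-odds of p_i is eps * 2**i."""
    return [1.0 / (1.0 + math.exp(-eps * 2 ** i)) for i in range(n)]


def infer_labels(score, preds, eps=0.01):
    """Recover all labels from one Log-Loss score on the crafted predictions.

    N * score = -sum_i y_i * logit(p_i) - sum_i log(1 - p_i),
    so sum_i y_i * 2**i can be isolated and read off bit by bit.
    """
    n = len(preds)
    weighted_sum = -(n * score + sum(math.log(1 - p) for p in preds))
    code = round(weighted_sum / eps)
    return [(code >> i) & 1 for i in range(n)]


# Usage: the "oracle" knows the secret labels; the adversary sees only the score.
secret = [1, 0, 0, 1, 1, 0, 1, 0]
preds = craft_predictions(len(secret))
score = log_loss(secret, preds)
recovered = infer_labels(score, preds)
print(recovered == secret)
```

This sketch relies on floating-point precision and is only suitable for small N, since the encoded weights double with every datapoint; it is meant to show the leakage mechanism, not to be an efficient attack.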
Anthology ID:
2020.privatenlp-1.1
Volume:
Proceedings of the Second Workshop on Privacy in NLP
Month:
November
Year:
2020
Address:
Online
Venue:
PrivateNLP
Publisher:
Association for Computational Linguistics
Pages:
1–6
URL:
https://aclanthology.org/2020.privatenlp-1.1
DOI:
10.18653/v1/2020.privatenlp-1.1
Cite (ACL):
Abhinav Aggarwal, Zekun Xu, Oluwaseyi Feyisetan, and Nathanael Teissier. 2020. On Log-Loss Scores and (No) Privacy. In Proceedings of the Second Workshop on Privacy in NLP, pages 1–6, Online. Association for Computational Linguistics.
Cite (Informal):
On Log-Loss Scores and (No) Privacy (Aggarwal et al., PrivateNLP 2020)
PDF:
https://aclanthology.org/2020.privatenlp-1.1.pdf
Video:
https://slideslive.com/38939769