Peru Bhardwaj
2021
Poisoning Knowledge Graph Embeddings via Relation Inference Patterns
Peru Bhardwaj
|
John Kelleher
|
Luca Costabello
|
Declan O’Sullivan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
We study the problem of generating data poisoning attacks against Knowledge Graph Embedding (KGE) models for the task of link prediction in knowledge graphs. To poison KGE models, we propose to exploit their inductive abilities which are captured through the relationship patterns like symmetry, inversion and composition in the knowledge graph. Specifically, to degrade the model’s prediction confidence on target facts, we propose to improve the model’s prediction confidence on a set of decoy facts. Thus, we craft adversarial additions that can improve the model’s prediction confidence on decoy facts through different inference patterns. Our experiments demonstrate that the proposed poisoning attacks outperform state-of-art baselines on four KGE models for two publicly available datasets. We also find that the symmetry pattern based attacks generalize across all model-dataset combinations which indicates the sensitivity of KGE models to this pattern.
Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods
Peru Bhardwaj
|
John Kelleher
|
Luca Costabello
|
Declan O’Sullivan
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Despite the widespread use of Knowledge Graph Embeddings (KGE), little is known about the security vulnerabilities that might disrupt their intended behaviour. We study data poisoning attacks against KGE models for link prediction. These attacks craft adversarial additions or deletions at training time to cause model failure at test time. To select adversarial deletions, we propose to use the model-agnostic instance attribution methods from Interpretable Machine Learning, which identify the training instances that are most influential to a neural model’s predictions on test instances. We use these influential triples as adversarial deletions. We further propose a heuristic method to replace one of the two entities in each influential triple to generate adversarial additions. Our experiments show that the proposed strategies outperform the state-of-art data poisoning attacks on KGE models and improve the MRR degradation due to the attacks by up to 62% over the baselines.