How Does BERT Rerank Passages? An Attribution Analysis with Information Bottlenecks

Zhiying Jiang, Raphael Tang, Ji Xin, Jimmy Lin


Abstract
Fine-tuned pre-trained transformers achieve the state of the art in passage reranking. Unfortunately, how they make their predictions remains largely unexplained, especially at the end-to-end, input-to-output level. Little is known about how tokens, layers, and passages precisely contribute to the final prediction. In this paper, we address this gap by leveraging the recently developed information bottlenecks for attribution (IBA) framework. On BERT-based models for passage reranking, we quantitatively demonstrate the framework’s veracity in extracting attribution maps, from which we perform a detailed, token-wise analysis of how predictions are made. Overall, we find that BERT still relies on exact token matching for reranking; the [CLS] token mainly gathers information for predictions at the last layer; top-ranked passages are robust to token removal; and BERT fine-tuned on MS MARCO has a positional bias towards the start of the passage.
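The IBA framework the abstract refers to attributes a prediction to input tokens by injecting noise into an intermediate layer and optimizing a per-token "keep" mask that preserves the prediction while paying a KL-divergence information cost for every token it lets through. The sketch below is a minimal toy illustration of that objective, not the paper's actual model or code: the "hidden states," scorer, token count, and hyperparameters are all invented for illustration, and the mask is optimized with finite-difference gradient descent rather than backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy setup (illustrative only; not the paper's model or data) ---
T, d = 5, 8                          # tokens, hidden size
H = rng.normal(size=(T, d))          # stand-in for one layer's hidden states
H[2, 0] = 4.0                        # plant the "relevance signal" on token 2
H -= H.mean(axis=0)                  # center features so the noise can be zero-mean
sigma = H.std(axis=0) + 1e-6         # per-feature noise scale
w = np.zeros(d)
w[0] = 3.0                           # toy relevance scorer; true label: "relevant"

def loss(a, beta=0.005):
    """IBA-style objective: keep the prediction while paying a KL
    'information cost' for every token whose state passes the bottleneck."""
    lam = 1.0 / (1.0 + np.exp(-a))   # per-token keep-probability in (0, 1)
    z = lam[:, None] * H             # expected bottleneck output (noise mean is 0)
    s = (z @ w).sum()                # toy relevance score
    ce = np.logaddexp(0.0, -s)       # -log sigmoid(s): keep predicting "relevant"
    # KL( N(lam*h, (1-lam)^2 sigma^2) || N(0, sigma^2) ), summed over features:
    kl = (-np.log(1.0 - lam[:, None])
          + ((1.0 - lam[:, None]) ** 2 + (lam[:, None] * H / sigma) ** 2) / 2.0
          - 0.5).sum()
    return ce + beta * kl

# Optimize the per-token mask; finite differences keep the sketch dependency-free.
a = np.zeros(T)
for _ in range(400):
    g = np.zeros(T)
    for t in range(T):
        e = np.zeros(T)
        e[t] = 1e-4
        g[t] = (loss(a + e) - loss(a - e)) / 2e-4
    a -= 0.5 * g

lam = 1.0 / (1.0 + np.exp(-a))
print({t: round(float(lam[t]), 3) for t in range(T)})
```

After optimization, the mask concentrates on the token that actually drives the score (token 2 in this toy), which is the sense in which the optimized bottleneck mask serves as an attribution map over tokens.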
Anthology ID:
2021.blackboxnlp-1.39
Volume:
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venues:
BlackboxNLP | EMNLP
Publisher:
Association for Computational Linguistics
Pages:
496–509
URL:
https://aclanthology.org/2021.blackboxnlp-1.39
PDF:
https://aclanthology.org/2021.blackboxnlp-1.39.pdf
Data
MS MARCO