Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference

Sara Rajaee; Yadollah Yaghoobzadeh; Mohammad Taher Pilehvar

doi:10.18653/v1/2022.emnlp-main.725

Abstract

It has been shown that NLI models are usually biased with respect to the word overlap between the premise and the hypothesis, as they take this feature as a primary cue for predicting the entailment label. In this paper, we focus on an overlooked aspect of the overlap bias in NLI models: the reverse word-overlap bias. Our experimental results demonstrate that current NLI systems are also highly biased towards the non-entailment label on instances with low overlap, and that existing debiasing methods, which are reportedly successful on challenge datasets, are generally ineffective in addressing this category of bias. Through a set of analyses, we investigate the reasons for the emergence of the overlap bias and the role of minority examples in mitigating this bias. For the former, we find that the word-overlap bias does not stem from pre-training; for the latter, we observe that, in contrast to the accepted assumption, eliminating minority examples does not affect the generalizability of debiasing methods with respect to the overlap bias.
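
To make the notion of premise-hypothesis overlap concrete, here is a minimal sketch. It assumes overlap is measured as the fraction of distinct hypothesis words that also occur in the premise; this is an illustrative definition, and the paper may define it differently. The function name `word_overlap`, the simple regex tokenizer, and the example sentences are all hypothetical and not taken from the paper.

```python
import re


def word_overlap(premise: str, hypothesis: str) -> float:
    """Fraction of distinct hypothesis words that also occur in the premise.

    Assumed, illustrative definition: lowercase both sentences, split them
    into word tokens with a simple regex, and compare the token sets.
    """
    def tokenize(text: str) -> set:
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    premise_tokens = tokenize(premise)
    hypothesis_tokens = tokenize(hypothesis)
    if not hypothesis_tokens:
        return 0.0  # avoid division by zero for an empty hypothesis
    return len(premise_tokens & hypothesis_tokens) / len(hypothesis_tokens)


# High overlap: every hypothesis word appears in the premise.
high = word_overlap("The doctor near the actor danced.", "The actor danced.")

# Low overlap: no hypothesis word appears in the premise.
low = word_overlap("A man is playing a guitar on stage.", "Someone performs music.")
```

Under this assumed measure, the well-known overlap bias concerns pairs like the first one (a model leaning towards entailment), while the reverse bias discussed in the abstract concerns pairs like the second one (a model leaning towards non-entailment).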

Anthology ID: 2022.emnlp-main.725
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 10605–10616
URL: https://aclanthology.org/2022.emnlp-main.725/
DOI: 10.18653/v1/2022.emnlp-main.725
Cite (ACL): Sara Rajaee, Yadollah Yaghoobzadeh, and Mohammad Taher Pilehvar. 2022. Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10605–10616, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference (Rajaee et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.725.pdf
