Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Prajjwal Bhargava, Aleksandr Drozd, Anna Rogers


Abstract
Much of the recent progress in NLU has been shown to be due to models learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) across a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report two successful and three unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.
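The evaluation protocol named in the abstract (train on MNLI, test out-of-distribution on HANS) can be sketched roughly as below. This is a minimal illustration, not the authors' code (their repository is linked at the bottom of this page); the checkpoint path is a hypothetical placeholder, and the assumption that entailment is label index 0 may differ per checkpoint.

import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "path/to/bert-finetuned-on-mnli"  # hypothetical MNLI-finetuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

# HANS (McCoy et al., 2019) on the Hugging Face hub: 0 = entailment, 1 = non-entailment
hans = load_dataset("hans", split="validation")

correct = 0
for start in range(0, len(hans), 32):
    batch = hans[start : start + 32]  # slicing a Dataset yields a dict of lists
    enc = tokenizer(
        batch["premise"], batch["hypothesis"],
        padding=True, truncation=True, return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**enc).logits  # 3-way MNLI logits
    # MNLI predicts entailment/neutral/contradiction; HANS is binary,
    # so neutral and contradiction predictions collapse to non-entailment.
    preds = (logits.argmax(dim=-1) != 0).long()  # assumes entailment = index 0
    correct += (preds == torch.tensor(batch["label"])).sum().item()

print(f"HANS accuracy: {correct / len(hans):.3f}")

MNLI-trained models that rely on lexical-overlap heuristics typically score near chance (or below) on the non-entailment half of HANS, which is what makes this transfer a useful probe of generalization.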
Anthology ID: 2021.insights-1.18
Volume: Proceedings of the Second Workshop on Insights from Negative Results in NLP
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: João Sedoc, Anna Rogers, Anna Rumshisky, Shabnam Tafreshi
Venue: insights
Publisher: Association for Computational Linguistics
Pages: 125–135
URL: https://aclanthology.org/2021.insights-1.18
DOI: 10.18653/v1/2021.insights-1.18
Cite (ACL): Prajjwal Bhargava, Aleksandr Drozd, and Anna Rogers. 2021. Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics. In Proceedings of the Second Workshop on Insights from Negative Results in NLP, pages 125–135, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics (Bhargava et al., insights 2021)
PDF: https://aclanthology.org/2021.insights-1.18.pdf
Video: https://aclanthology.org/2021.insights-1.18.mp4
Code: prajjwal1/generalize_lm_nli
Data: MultiNLI