%0 Conference Proceedings %T IMPLI: Investigating NLI Models’ Performance on Figurative Language %A Stowe, Kevin %A Utama, Prasetya %A Gurevych, Iryna %Y Muresan, Smaranda %Y Nakov, Preslav %Y Villavicencio, Aline %S Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) %D 2022 %8 May %I Association for Computational Linguistics %C Dublin, Ireland %F stowe-etal-2022-impli %X Natural language inference (NLI) has been widely used as a task to train and evaluate models for language understanding. However, the ability of NLI models to perform inferences requiring understanding of figurative language such as idioms and metaphors remains understudied. We introduce the IMPLI (Idiomatic and Metaphoric Paired Language Inference) dataset, an English dataset consisting of paired sentences spanning idioms and metaphors. We develop novel methods to generate 24k semiautomatic pairs as well as manually creating 1.8k gold pairs. We use IMPLI to evaluate NLI models based on RoBERTa fine-tuned on the widely used MNLI dataset. We then show that while they can reliably detect entailment relationship between figurative phrases with their literal counterparts, they perform poorly on similarly structured examples where pairs are designed to be non-entailing. This suggests the limits of current NLI models with regard to understanding figurative language and this dataset serves as a benchmark for future improvements in this direction. %R 10.18653/v1/2022.acl-long.369 %U https://aclanthology.org/2022.acl-long.369 %U https://doi.org/10.18653/v1/2022.acl-long.369 %P 5375-5388