An Investigation of the (In)effectiveness of Counterfactually Augmented Data

Nitish Joshi, He He


Abstract
While pretrained language models achieve excellent performance on natural language understanding benchmarks, they tend to rely on spurious correlations and generalize poorly to out-of-distribution (OOD) data. Recent work has explored using counterfactually-augmented data (CAD)—data generated by minimally perturbing examples to flip the ground-truth label—to identify robust features that are invariant under distribution shift. However, empirical results using CAD during training for OOD generalization have been mixed. To explain this discrepancy, through a toy theoretical example and empirical analysis on two crowdsourced CAD datasets, we show that: (a) while features perturbed in CAD are indeed robust features, training on CAD may prevent the model from learning unperturbed robust features; and (b) CAD may exacerbate existing spurious correlations in the data. Our results thus show that a lack of perturbation diversity limits CAD's effectiveness for OOD generalization, calling for innovative crowdsourcing procedures to elicit diverse perturbations of examples.
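The perturbation scheme behind CAD is easiest to see on a concrete pair. Below is a minimal Python sketch, assuming a sentiment-classification setting; the example sentences and the `augment_with_counterfactuals` helper are hypothetical illustrations of the idea, not the paper's actual data or pipeline.

```python
# Minimal illustration of counterfactually-augmented data (CAD):
# each original example is paired with a minimally edited version
# whose ground-truth label is flipped.

# Hypothetical sentiment examples (not from the paper's datasets).
originals = [
    ("The acting was brilliant and the plot gripping.", "positive"),
    ("The film dragged on with a predictable ending.", "negative"),
]

# Human-written counterfactuals: minimal edits that flip the label.
counterfactuals = [
    ("The acting was wooden and the plot dull.", "negative"),
    ("The film flew by with a surprising ending.", "positive"),
]

def augment_with_counterfactuals(orig_pairs, cf_pairs):
    """Interleave originals with their counterfactual pairs so a model
    sees both versions of each example during training."""
    augmented = []
    for (x, y), (x_cf, y_cf) in zip(orig_pairs, cf_pairs):
        assert y != y_cf, "a counterfactual must flip the label"
        augmented.append((x, y))
        augmented.append((x_cf, y_cf))
    return augmented

train_set = augment_with_counterfactuals(originals, counterfactuals)
for text, label in train_set:
    print(f"{label:>8}: {text}")
```

Note that every counterfactual in this toy set edits the same kind of feature (sentiment adjectives); the paper argues that exactly this lack of perturbation diversity is what limits CAD's benefit for OOD generalization.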
Anthology ID:
2022.acl-long.256
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3668–3681
URL:
https://aclanthology.org/2022.acl-long.256
DOI:
10.18653/v1/2022.acl-long.256
Cite (ACL):
Nitish Joshi and He He. 2022. An Investigation of the (In)effectiveness of Counterfactually Augmented Data. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3668–3681, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
An Investigation of the (In)effectiveness of Counterfactually Augmented Data (Joshi & He, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.256.pdf
Software:
2022.acl-long.256.software.zip
Video:
https://aclanthology.org/2022.acl-long.256.mp4
Code:
joshinh/investigation-cad
Data:
BoolQ, MultiNLI