Local Structure Matters Most: Perturbation Study in NLU

Louis Clouatre, Prasanna Parthasarathi, Amal Zouaq, Sarath Chandar


Abstract
Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that neural models are surprisingly insensitive to the order of words. In this paper, we investigate this phenomenon by developing order-altering perturbations on the order of words, subwords, and characters to analyze their effect on neural models’ performance on language understanding tasks. We experiment with measuring the impact of perturbations to the local neighborhood of characters and global position of characters in the perturbed texts and observe that perturbation functions found in prior literature only affect the global ordering while the local ordering remains relatively unperturbed. We empirically show that neural models, invariant of their inductive biases, pretraining scheme, or the choice of tokenization, mostly rely on the local structure of text to build understanding and make limited use of the global structure.
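The distinction the abstract draws between global and local ordering can be illustrated with a small sketch. Everything below is an assumption made for illustration: the function names, the window-based character shuffle, and the bigram-overlap score are hypothetical stand-ins and are not taken from the paper's released software or its actual perturbation functions and metrics. The sketch only shows why shuffling whole words may leave most character neighborhoods intact, while shuffling characters inside small windows disturbs them.

```python
import random


def shuffle_words(text, rng):
    """Global perturbation (illustrative): permute whole words.

    Characters inside each word keep their neighbors.
    """
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def shuffle_chars_in_window(text, window, rng):
    """Local perturbation (illustrative): shuffle characters within
    consecutive, non-overlapping windows of `window` characters."""
    chars = list(text)
    for start in range(0, len(chars), window):
        chunk = chars[start:start + window]
        rng.shuffle(chunk)
        chars[start:start + window] = chunk
    return "".join(chars)


def preserved_bigram_fraction(original, perturbed):
    """Hypothetical proxy for local structure: the fraction of the
    original's distinct character bigrams that still occur in the
    perturbed text. This is NOT the metric defined in the paper."""
    orig = {original[i:i + 2] for i in range(len(original) - 1)}
    pert = {perturbed[i:i + 2] for i in range(len(perturbed) - 1)}
    return len(orig & pert) / len(orig) if orig else 1.0


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    text = "neural models are insensitive to the order of words"
    by_word = shuffle_words(text, rng)
    by_char = shuffle_chars_in_window(text, 4, rng)
    print(by_word, preserved_bigram_fraction(text, by_word))
    print(by_char, preserved_bigram_fraction(text, by_char))
```

Under these assumptions, the word-level shuffle changes where each word sits in the sentence but keeps every within-word bigram, which is one way to read the abstract's observation that earlier perturbation functions mostly alter global ordering.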
Anthology ID:
2022.findings-acl.293
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3712–3731
URL:
https://aclanthology.org/2022.findings-acl.293
DOI:
10.18653/v1/2022.findings-acl.293
Cite (ACL):
Louis Clouatre, Prasanna Parthasarathi, Amal Zouaq, and Sarath Chandar. 2022. Local Structure Matters Most: Perturbation Study in NLU. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3712–3731, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Local Structure Matters Most: Perturbation Study in NLU (Clouatre et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.293.pdf
Software:
 2022.findings-acl.293.software.zip
Video:
 https://aclanthology.org/2022.findings-acl.293.mp4
Data:
GLUE