Using Punctuation as an Adversarial Attack on Deep Learning-Based NLP Systems: An Empirical Study

Brian Formento, Chuan Sheng Foo, Luu Anh Tuan, See Kiong Ng


Abstract
This work empirically investigates punctuation insertions as adversarial attacks on NLP systems. Experiments on three tasks, five datasets, and six models with four attacks show that punctuation insertions, when limited to a few symbols (apostrophes and hyphens), are a superior attack vector compared to alphabetical character insertions because they yield 1) a lower after-attack accuracy (Aaft-atk); 2) higher semantic similarity between the resulting and original texts; and 3) a resulting text that is easier and faster to read, as assessed with the Test of Word Reading Efficiency (TOWRE). The tests also indicate that 4) grammar checking does not mitigate punctuation insertions and 5) punctuation insertions outperform word-level attacks in settings with a limited number of word synonyms and queries to the victim’s model. Our findings indicate that inserting a few punctuation types that result in easy-to-read samples is a general attack mechanism. In light of this threat, we assess the impact of punctuation insertions, potential mitigations, the mitigations’ tradeoffs, and the worst-case scenarios of punctuation insertions, and we summarize our findings in a qualitative causal map so that developers can design safer, more secure systems.
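To make the perturbation concrete, here is a minimal Python sketch of a punctuation insertion restricted to apostrophes and hyphens. It is an illustration only, not the authors' implementation: the function name `insert_punctuation`, the `max_edits` and `seed` parameters, and the random choice of words and positions are all assumptions. An actual attack would presumably choose words and positions by querying the victim model rather than at random.

```python
import random

# The two symbols named in the abstract; everything else here is hypothetical.
PUNCTUATION = ("'", "-")


def insert_punctuation(text, max_edits=2, seed=0):
    """Insert one apostrophe or hyphen inside up to `max_edits` words.

    Words and positions are picked at random (an assumption for this sketch);
    no victim model is queried.
    """
    rng = random.Random(seed)
    words = text.split()
    # Only purely alphabetic words with at least two characters
    # have an interior position to insert into.
    candidates = [i for i, w in enumerate(words) if len(w) >= 2 and w.isalpha()]
    for i in rng.sample(candidates, min(max_edits, len(candidates))):
        pos = rng.randint(1, len(words[i]) - 1)  # interior position, inclusive bounds
        words[i] = words[i][:pos] + rng.choice(PUNCTUATION) + words[i][pos:]
    return " ".join(words)


if __name__ == "__main__":
    print(insert_punctuation("the movie was surprisingly good"))
```

Stripping the inserted symbols from the output recovers the original text, which is one informal way to see why such perturbations stay readable to humans.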
Anthology ID:
2023.findings-eacl.1
Volume:
Findings of the Association for Computational Linguistics: EACL 2023
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Andreas Vlachos, Isabelle Augenstein
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1–34
URL:
https://aclanthology.org/2023.findings-eacl.1
DOI:
10.18653/v1/2023.findings-eacl.1
Cite (ACL):
Brian Formento, Chuan Sheng Foo, Luu Anh Tuan, and See Kiong Ng. 2023. Using Punctuation as an Adversarial Attack on Deep Learning-Based NLP Systems: An Empirical Study. In Findings of the Association for Computational Linguistics: EACL 2023, pages 1–34, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Using Punctuation as an Adversarial Attack on Deep Learning-Based NLP Systems: An Empirical Study (Formento et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-eacl.1.pdf
Software:
 2023.findings-eacl.1.software.zip
Video:
 https://aclanthology.org/2023.findings-eacl.1.mp4