Adversarial Examples for Evaluating Math Word Problem Solvers

Vivek Kumar, Rishabh Maheshwary, Vikram Pudi


Abstract
Standard accuracy metrics have shown that Math Word Problem (MWP) solvers have achieved high performance on benchmark datasets. However, the extent to which existing MWP solvers truly understand language and its relation with numbers is still unclear. In this paper, we generate adversarial attacks to evaluate the robustness of state-of-the-art MWP solvers. We propose two methods, Question Reordering and Sentence Paraphrasing, to generate adversarial attacks. We conduct experiments across three neural MWP solvers over two benchmark datasets. On average, our attack method is able to reduce the accuracy of MWP solvers by over 40% on these datasets. Our results demonstrate that existing MWP solvers are sensitive to linguistic variations in the problem text. We verify the validity and quality of generated adversarial examples through human evaluation.
Anthology ID:
2021.findings-emnlp.230
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2705–2712
URL:
https://aclanthology.org/2021.findings-emnlp.230
DOI:
10.18653/v1/2021.findings-emnlp.230
Cite (ACL):
Vivek Kumar, Rishabh Maheshwary, and Vikram Pudi. 2021. Adversarial Examples for Evaluating Math Word Problem Solvers. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2705–2712, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Adversarial Examples for Evaluating Math Word Problem Solvers (Kumar et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.230.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.230.mp4
Code:
kevivk/mwp_adversarial
Data:
ASDiv, MAWPS