ArMATH: a Dataset for Solving Arabic Math Word Problems

Reem Alghamdi, Zhenwen Liang, Xiangliang Zhang


Abstract
This paper studies the use of deep learning to solve Arabic Math Word Problems. A Math Word Problem (MWP) is a text description of a mathematical problem that can be solved by deriving a math equation to reach the answer. Effective models have been developed for solving MWPs in English and Chinese, but Arabic MWPs are rarely studied. This paper contributes the first large-scale dataset for Arabic MWPs, which contains 6,000 samples of primary-school math problems written in Modern Standard Arabic (MSA). Arabic MWP solvers are then built with deep learning models and evaluated on this dataset. In addition, a transfer learning model is built so that the high-resource Chinese MWP solver can improve the performance of the low-resource Arabic MWP solver. This work is the first to use deep learning methods to solve Arabic MWPs and the first to use transfer learning to solve MWPs across different languages. The transfer-learning-enhanced solver has an accuracy of 74.15%, which is 3% higher than the solver without transfer learning. We make the dataset and solvers publicly available to encourage more research on Arabic MWPs: https://github.com/reem-codes/ArMATH
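As a rough illustration of what "deriving a math equation to reach the answer" and solver accuracy can mean in practice, the sketch below evaluates predicted arithmetic expressions and counts a prediction as correct when its value matches the gold answer. The function names, the tolerance, and the sample expressions are assumptions made for illustration only; they are not taken from the ArMATH dataset or from the authors' code.

```python
import ast
import operator

# Binary operators allowed in this sketch (an assumed operator set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def evaluate_equation(expr: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")

    return _eval(ast.parse(expr, mode="eval"))


def answer_accuracy(predicted, gold, tol=1e-4):
    """Fraction of predicted expressions whose value matches the gold answer."""
    correct = 0
    for expr, answer in zip(predicted, gold):
        try:
            value = evaluate_equation(expr)
        except (ValueError, SyntaxError, ZeroDivisionError):
            continue  # a malformed or invalid prediction counts as wrong
        if abs(value - answer) <= tol:
            correct += 1
    return correct / len(gold)


# Hypothetical problem: "3 boxes hold 12 pencils each; 5 are given away."
predictions = ["3 * 12 - 5", "10 / 4", "7 +"]
gold_answers = [31, 2.0, 7]
accuracy = answer_accuracy(predictions, gold_answers)
```

Here only the first prediction evaluates to its gold answer, so one of three is counted as correct. Comparing evaluated values rather than expression strings means that equivalent equations written differently are still credited.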
Anthology ID:
2022.lrec-1.37
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
351–362
URL:
https://aclanthology.org/2022.lrec-1.37
Cite (ACL):
Reem Alghamdi, Zhenwen Liang, and Xiangliang Zhang. 2022. ArMATH: a Dataset for Solving Arabic Math Word Problems. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 351–362, Marseille, France. European Language Resources Association.
Cite (Informal):
ArMATH: a Dataset for Solving Arabic Math Word Problems (Alghamdi et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.37.pdf
Code
reem-codes/armath
Data
MAWPS, Math23K, MathQA