UniMath: A Foundational and Multimodal Mathematical Reasoner

Zhenwen Liang; Tianyu Yang; Jipeng Zhang; Xiangliang Zhang

doi:10.18653/v1/2023.emnlp-main.440

Abstract

Despite significant progress in natural language processing (NLP), existing methods remain limited in interpreting and processing diverse mathematical modalities. We therefore introduce UniMath, a versatile, unified system for multimodal mathematical reasoning tasks. To tackle problem-solving in arithmetic, geometry, and table-based math, UniMath uses a fine-tuned T5 model augmented with a variational autoencoder (VAE)-based image tokenizer. Jointly trained and evaluated on three diverse datasets (SVAMP, GeoQA, and TableMWP), UniMath achieves state-of-the-art performance. The model’s generalization ability is further demonstrated by fine-tuning on two additional datasets, MathQA and Geo-Proving. Through comprehensive evaluations, we show that joint training across diverse math tasks improves overall model performance and enhances the model’s ability to generalize across different mathematical reasoning tasks. This approach provides a blueprint for, and aims to inspire, further work on unified mathematical reasoning with deep learning systems.

Anthology ID: 2023.emnlp-main.440
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 7126–7133
URL: https://aclanthology.org/2023.emnlp-main.440/
DOI: 10.18653/v1/2023.emnlp-main.440
Cite (ACL): Zhenwen Liang, Tianyu Yang, Jipeng Zhang, and Xiangliang Zhang. 2023. UniMath: A Foundational and Multimodal Mathematical Reasoner. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 7126–7133, Singapore. Association for Computational Linguistics.
Cite (Informal): UniMath: A Foundational and Multimodal Mathematical Reasoner (Liang et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.440.pdf
Video: https://aclanthology.org/2023.emnlp-main.440.mp4
