David Arnau


2024

Beyond the Hype: Identifying and Analyzing Math Word Problem-Solving Challenges for Large Language Models
Romina Soledad Albornoz-De Luise | David Arnau | Pablo Arnau-González | Miguel Arevalillo-Herráez
Proceedings of the 2nd Workshop on Practical LLM-assisted Data-to-Text Generation

Despite not being explicitly trained for this purpose, models like Mistral and LLaMA have demonstrated impressive results across numerous tasks, including generating solutions to Mathematical Word Problems (MWPs). Solving an MWP involves translating a textual description into a mathematical model or equation and then solving it. However, these models face challenges in accurately interpreting and utilizing the numerical information present in MWP statements, which can lead to errors in the generated solutions. To better understand the limitations of LLMs, we analyzed the MWPs from the SVAMP dataset that the models failed to solve accurately. By categorizing these MWPs, we identify specific types of problems where the models are most prone to errors, providing insights into the underlying challenges faced by LLMs in problem-solving scenarios and opening new modeling opportunities. By understanding the expected errors, researchers can design strategies to model problems more effectively and choose the most suitable LLM for solving them, taking into account each model’s strengths and weaknesses.