Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

Houxing Ren; Mingjie Zhan; Zimu Lu; Ke Wang; Yunqiao Yang; Haotian Hou; Hongsheng Li

Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

Houxing Ren, Mingjie Zhan, Zimu Lu, Ke Wang, Yunqiao Yang, Haotian Hou, Hongsheng Li

Abstract

Spreadsheets are central to real-world applications such as enterprise reporting, auditing, and scientific data management. Despite their ubiquity, existing large language model based approaches typically treat tables as plain text, overlooking critical layout cues and visual semantics. Moreover, real-world spreadsheets are often massive in scale, exceeding the input length that LLMs can efficiently process. To address these challenges, we propose SpreadsheetAgent, a two-stage multi-agent framework for spreadsheet understanding that adopts a step-by-step reading and reasoning paradigm. Instead of loading the entire spreadsheet at once, SpreadsheetAgent incrementally interprets localized regions through multiple modalities, including code execution results, images, and LaTeX tables. The method first constructs a structural sketch and row/column summaries, and then performs task-driven reasoning over this intermediate representation in the Solving Stage. To further enhance reliability, we design a verification module that validates extracted structures via targeted inspections, reducing error propagation and ensuring trustworthy inputs for downstream reasoning. Extensive experiments on two spreadsheet datasets demonstrate the effectiveness of our approach. With GPT-OSS-120B, SpreadsheetAgent achieves 38.16% on Spreadsheet Bench, outperforming the ChatGPT Agent baseline (35.27%) by 2.89 absolute points. These results highlight the potential of SpreadsheetAgent to advance robust and scalable spreadsheet understanding in real-world applications. Code is available at https://github.com/renhouxing/SpreadsheetAgent.

Anthology ID:: 2026.acl-long.86
Volume:: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:: July
Year:: 2026
Address:: San Diego, California, United States
Editors:: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 1906–1933
Language:
URL:: https://aclanthology.org/2026.acl-long.86/
DOI:
Bibkey:
Cite (ACL):: Houxing Ren, Mingjie Zhan, Zimu Lu, Ke Wang, Yunqiao Yang, Haotian Hou, and Hongsheng Li. 2026. Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1906–1933, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):: Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning (Ren et al., ACL 2026)
Copy Citation:
PDF:: https://aclanthology.org/2026.acl-long.86.pdf
Checklist:: 2026.acl-long.86.checklist.pdf

PDF Cite Search Checklist Fix data