Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data

Ekaterina Borisova; Fabio Barth; Nils Feldhus; Raia Abu Ahmad; Malte Ostendorff; Pedro Ortiz Suarez; Georg Rehm; Sebastian Möller

doi:10.18653/v1/2025.trl-1.10

Abstract

Tables are among the most widely used tools for representing structured data in research, business, medicine, and education. Although LLMs demonstrate strong performance in downstream tasks, their efficiency in processing tabular data remains underexplored. In this paper, we investigate the effectiveness of both text-based and multimodal LLMs on table understanding tasks through a cross-domain and cross-modality evaluation. Specifically, we compare their performance on tables from scientific vs. non-scientific contexts and examine their robustness on tables represented as images vs. text. Additionally, we conduct an interpretability analysis to measure context usage and input relevance. We also introduce the TableEval benchmark, comprising 3017 tables from scholarly publications, Wikipedia, and financial reports, where each table is provided in five different formats: Image, Dictionary, HTML, XML, and LaTeX. Our findings indicate that while LLMs maintain robustness across table modalities, they face significant challenges when processing scientific tables.
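To make the idea of one table in several text formats concrete, here is a minimal sketch that serializes a single toy table as a dictionary, HTML, XML, and LaTeX. The table contents, the function names, and the exact tag and key names are assumptions made for illustration; they are not taken from TableEval, whose actual schemas may differ. The image format is omitted because it would require a rendering step.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape

# Hypothetical toy table; the values are illustrative, not benchmark data.
header = ["Model", "Accuracy"]
rows = [["A", "0.81"], ["B", "0.77"]]


def to_dict(header, rows):
    """Dictionary form: a header list plus a list of row lists (assumed layout)."""
    return {"header": list(header), "rows": [list(r) for r in rows]}


def to_html(header, rows):
    """HTML form: one <tr> of <th> cells, then one <tr> of <td> cells per row."""
    head = "".join(f"<th>{html_escape(c)}</th>" for c in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{html_escape(c)}</td>" for c in r) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def to_xml(header, rows):
    """XML form with assumed <table>/<row>/<cell> element names."""
    def row_xml(cells):
        return "<row>" + "".join(f"<cell>{xml_escape(c)}</cell>" for c in cells) + "</row>"
    return "<table>" + row_xml(header) + "".join(row_xml(r) for r in rows) + "</table>"


def to_latex(header, rows):
    """LaTeX tabular form; assumes cells contain no LaTeX special characters."""
    lines = ["\\begin{tabular}{" + "l" * len(header) + "}"]
    lines.append(" & ".join(header) + " \\\\")
    lines.append("\\hline")
    lines += [" & ".join(r) + " \\\\" for r in rows]
    lines.append("\\end{tabular}")
    return "\n".join(lines)


if __name__ == "__main__":
    for render in (to_dict, to_html, to_xml, to_latex):
        print(render(header, rows))
```

The same cell values appear in every serialization, so a comparison across formats changes only the markup around them.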

Anthology ID: 2025.trl-1.10
Volume: Proceedings of the 4th Table Representation Learning Workshop
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Shuaichen Chang, Madelon Hulsebos, Qian Liu, Wenhu Chen, Huan Sun
Venues: TRL | WS
Publisher: Association for Computational Linguistics
Pages: 109–142
URL: https://aclanthology.org/2025.trl-1.10/
DOI: 10.18653/v1/2025.trl-1.10
Cite (ACL): Ekaterina Borisova, Fabio Barth, Nils Feldhus, Raia Abu Ahmad, Malte Ostendorff, Pedro Ortiz Suarez, Georg Rehm, and Sebastian Möller. 2025. Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data. In Proceedings of the 4th Table Representation Learning Workshop, pages 109–142, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data (Borisova et al., TRL 2025)
PDF: https://aclanthology.org/2025.trl-1.10.pdf
