Evaluating Structured Decoding for Text-to-Table Generation: Evidence from Three Datasets

Julian Oestreich, Lydia Müller


Abstract
We present a comprehensive evaluation of structured decoding for text-to-table generation with large language models (LLMs). While previous work has primarily focused on unconstrained generation of tables, the impact of enforcing structural constraints during generation remains underexplored. We systematically compare schema-guided (structured) decoding to standard one-shot prompting across three diverse benchmarks (E2E, Rotowire, and Livesum) using open-source LLMs of up to 32B parameters, to assess how table generation approaches perform in resource-constrained settings. Our experiments cover a wide range of evaluation metrics at the cell, row, and table levels. Results demonstrate that structured decoding significantly enhances the validity and alignment of generated tables, particularly in scenarios demanding precise numerical alignment (Rotowire), but may degrade performance in contexts involving densely packed textual information (E2E) or extensive aggregation over lengthy texts (Livesum). We further analyze the suitability of different evaluation metrics and discuss the influence of model size.
Anthology ID:
2025.r2lm-1.12
Volume:
Proceedings of the First Workshop on Comparative Performance Evaluation: From Rules to Language Models
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Alicia Picazo-Izquierdo, Ernesto Luis Estevanell-Valladares, Ruslan Mitkov, Rafael Muñoz Guillena, Raúl García Cerdá
Venues:
R2LM | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
113–122
URL:
https://aclanthology.org/2025.r2lm-1.12/
Cite (ACL):
Julian Oestreich and Lydia Müller. 2025. Evaluating Structured Decoding for Text-to-Table Generation: Evidence from Three Datasets. In Proceedings of the First Workshop on Comparative Performance Evaluation: From Rules to Language Models, pages 113–122, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Evaluating Structured Decoding for Text-to-Table Generation: Evidence from Three Datasets (Oestreich & Müller, R2LM 2025)
PDF:
https://aclanthology.org/2025.r2lm-1.12.pdf