QTSumm: Query-Focused Summarization over Tabular Data

Yilun Zhao; Zhenting Qi; Linyong Nan; Boyu Mi; Yixin Liu; Weijin Zou; Simeng Han; Ruizhe Chen; Xiangru Tang; Yumo Xu; Dragomir Radev; Arman Cohan

doi:10.18653/v1/2023.emnlp-main.74

Abstract

People primarily consult tables to conduct data analysis or answer specific questions. Text generation systems that can provide accurate table summaries tailored to users’ information needs can facilitate more efficient access to relevant data insights. Motivated by this, we define a new query-focused table summarization task, where text generation models have to perform human-like reasoning and analysis over the given table to generate a tailored summary. We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables covering diverse topics. We investigate a set of strong baselines on QTSumm, including text generation models, table-to-text generation models, and large language models. Experimental results and manual analysis reveal that the new task presents significant challenges in table-to-text generation for future research. Moreover, we propose a new approach named ReFactor, which retrieves and reasons over query-relevant information from tabular data to generate several natural language facts. Experimental results demonstrate that ReFactor can bring effective improvements to baselines by concatenating the generated facts to the model input. Our data and code are publicly available at https://github.com/yale-nlp/QTSumm.
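The abstract says ReFactor helps baselines by concatenating generated facts to the model input. The sketch below illustrates that idea only. The function names, the `query:`/`facts:`/`table:` prefixes, and the table linearization format are all assumptions made for illustration, not the format used in the paper or its repository.

```python
from typing import List


def linearize_table(header: List[str], rows: List[List[str]]) -> str:
    """Flatten a table into a single string (hypothetical format)."""
    head = " | ".join(header)
    body = " ".join(
        "row {}: {}".format(i + 1, " | ".join(row)) for i, row in enumerate(rows)
    )
    return "col: {} {}".format(head, body)


def build_model_input(
    query: str, facts: List[str], header: List[str], rows: List[List[str]]
) -> str:
    """Concatenate the query, generated facts, and linearized table.

    The ordering and the prefixes are assumptions; the source only states
    that generated facts are concatenated to the model input.
    """
    parts = [
        "query: " + query,
        "facts: " + " ".join(facts),
        "table: " + linearize_table(header, rows),
    ]
    return " ".join(parts)


# Toy example with made-up data.
example_input = build_model_input(
    query="Who scored more goals?",
    facts=["Bo scored 5 goals, more than Ann's 3."],
    header=["Player", "Goals"],
    rows=[["Ann", "3"], ["Bo", "5"]],
)
print(example_input)
```

The resulting string would then be passed to a text generation model in place of the plain query-plus-table input.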

Anthology ID: 2023.emnlp-main.74
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1157–1172
URL: https://aclanthology.org/2023.emnlp-main.74
DOI: 10.18653/v1/2023.emnlp-main.74
Cite (ACL): Yilun Zhao, Zhenting Qi, Linyong Nan, Boyu Mi, Yixin Liu, Weijin Zou, Simeng Han, Ruizhe Chen, Xiangru Tang, Yumo Xu, Dragomir Radev, and Arman Cohan. 2023. QTSumm: Query-Focused Summarization over Tabular Data. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 1157–1172, Singapore. Association for Computational Linguistics.
Cite (Informal): QTSumm: Query-Focused Summarization over Tabular Data (Zhao et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.74.pdf
Video: https://aclanthology.org/2023.emnlp-main.74.mp4
