TabPrompt: Graph-based Pre-training and Prompting for Few-shot Table Understanding

Rihui Jin, Jianan Wang, Wei Tan, Yongrui Chen, Guilin Qi, Wang Hao


Abstract
Table Understanding (TU) is a crucial aspect of information extraction that enables machines to comprehend the semantics behind tabular data. However, existing TU methods struggle with the scarcity of labeled tabular data. Moreover, they focus primarily on the textual content within a table while disregarding its inherent topological structure, which can lead to misinterpretation of the table's semantics. In this paper, we propose TabPrompt, a new framework that tackles both challenges. Because prompt-based learning has proven highly effective in few-shot settings, we introduce it to handle few-shot TU. Furthermore, since Graph Contrastive Learning (Graph CL) excels at capturing topological information, Graph Neural Networks are a natural choice for encoding tables. We therefore develop a novel Graph CL method tailored to tabular data, which serves as the pretext task during pre-training and yields vector representations that incorporate the table's topological information. Experimental results show that TabPrompt outperforms all strong baselines on few-shot table understanding tasks.
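For intuition, below is a minimal, self-contained sketch (in plain PyTorch) of the generic idea behind graph contrastive pre-training on a table: cells become graph nodes connected along rows and columns, two augmented views are encoded with a small GNN, and an InfoNCE loss pulls matching nodes together. The graph construction, dropout-based augmentation, and all helper names here are illustrative assumptions, not the paper's released implementation.

# Minimal sketch of graph contrastive pre-training on one table.
# Assumptions: plain PyTorch; random vectors stand in for cell text
# embeddings; feature dropout is the augmentation. Not TabPrompt's code.
import torch
import torch.nn.functional as F

def table_to_graph(n_rows, n_cols):
    """Adjacency over cells: each cell is a node with edges to its
    same-row and same-column neighbours (the table's topology)."""
    n = n_rows * n_cols
    adj = torch.eye(n)  # self-loops
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:   # neighbour to the right (same row)
                adj[i, i + 1] = adj[i + 1, i] = 1.0
            if r + 1 < n_rows:   # neighbour below (same column)
                adj[i, i + n_cols] = adj[i + n_cols, i] = 1.0
    return adj

class GraphEncoder(torch.nn.Module):
    """Two rounds of mean-aggregation message passing (GCN-style)."""
    def __init__(self, dim):
        super().__init__()
        self.lin1 = torch.nn.Linear(dim, dim)
        self.lin2 = torch.nn.Linear(dim, dim)

    def forward(self, x, adj):
        norm = adj / adj.sum(dim=1, keepdim=True)  # row-normalised adjacency
        x = F.relu(self.lin1(norm @ x))
        return self.lin2(norm @ x)

def info_nce(z1, z2, tau=0.5):
    """InfoNCE: the same node in the two views is the positive pair."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

# One toy pre-training step on a 3x4 table.
n_rows, n_cols, dim = 3, 4, 64
adj = table_to_graph(n_rows, n_cols)
x = torch.randn(n_rows * n_cols, dim)   # stand-in for cell embeddings
enc = GraphEncoder(dim)

z1 = enc(F.dropout(x, p=0.2), adj)      # view 1
z2 = enc(F.dropout(x, p=0.2), adj)      # view 2
loss = info_nce(z1, z2)
loss.backward()

In the paper's setting, node features would come from encoded cell text rather than random vectors, and the pretext task is tailored to tables; this sketch only illustrates the generic Graph CL training step.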
Anthology ID:
2023.findings-emnlp.493
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7373–7383
URL:
https://aclanthology.org/2023.findings-emnlp.493
DOI:
10.18653/v1/2023.findings-emnlp.493
Cite (ACL):
Rihui Jin, Jianan Wang, Wei Tan, Yongrui Chen, Guilin Qi, and Wang Hao. 2023. TabPrompt: Graph-based Pre-training and Prompting for Few-shot Table Understanding. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 7373–7383, Singapore. Association for Computational Linguistics.
Cite (Informal):
TabPrompt: Graph-based Pre-training and Prompting for Few-shot Table Understanding (Jin et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.493.pdf