Transformers for Tabular Data Representation: A Survey of Models and Applications

Gilbert Badaro, Mohammed Saeed, Paolo Papotti


Abstract
In the last few years, the natural language processing community has witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in tabular data, recent research efforts extend LMs by developing neural representations for structured data. In this article, we present a survey that analyzes these efforts. We first abstract the different systems according to a traditional machine learning pipeline in terms of training data, input representation, model training, and supported downstream tasks. For each aspect, we characterize and compare the proposed solutions. Finally, we discuss future work directions.
Anthology ID: 2023.tacl-1.14
Volume: Transactions of the Association for Computational Linguistics, Volume 11
Year: 2023
Address: Cambridge, MA
Venue: TACL
Publisher: MIT Press
Pages: 227–249
URL: https://aclanthology.org/2023.tacl-1.14
DOI: 10.1162/tacl_a_00544
Cite (ACL): Gilbert Badaro, Mohammed Saeed, and Paolo Papotti. 2023. Transformers for Tabular Data Representation: A Survey of Models and Applications. Transactions of the Association for Computational Linguistics, 11:227–249.
Cite (Informal): Transformers for Tabular Data Representation: A Survey of Models and Applications (Badaro et al., TACL 2023)
PDF: https://aclanthology.org/2023.tacl-1.14.pdf