Ewa Andrejczuk
2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk | Julian Eisenschlos | Francesco Piccinno | Syrine Krichene | Yasemin Altun
Findings of the Association for Computational Linguistics: EMNLP 2022
Encoder-only transformer models such as TAPAS have been successfully applied to a range of table understanding tasks. A major limitation of these architectures is that they are restricted to classification-like tasks such as cell selection or entailment detection. We present TabT5, an encoder-decoder model that generates natural language text from tables and textual inputs. TabT5 overcomes the encoder-only limitation by adding a decoder component, and it exploits the input structure through table-specific embeddings and pre-training. TabT5 achieves new state-of-the-art results in several domains, including spreadsheet formula prediction with a 15% increase in sequence accuracy, QA with a 2.5% increase in sequence accuracy, and data-to-text generation with a 2.5% increase in BLEU.
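Two of the reported gains are stated in sequence accuracy. The abstract does not define the metric, so the sketch below assumes the common reading: the fraction of generated outputs that match their reference exactly. The function name `sequence_accuracy` and the whitespace normalization are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def sequence_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference as a whole sequence."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0

    def normalize(text: str) -> str:
        # Assumption: collapse runs of whitespace so spacing alone is not an error.
        return " ".join(text.split())

    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(predictions)


if __name__ == "__main__":
    # Hypothetical formula predictions: one exact match out of two.
    preds = ["=SUM(A1:A3)", "=A1+B1"]
    refs = ["=SUM(A1:A3)", "=A1-B1"]
    print(sequence_accuracy(preds, refs))
```

Under this reading a single wrong token makes the whole output count as incorrect, which is stricter than an overlap-based score such as BLEU.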