A Package for Learning on Tabular and Text Data with Transformers

Ken Gu, Akshay Budhkar


Abstract
Recent progress in natural language processing has made Transformer architectures the predominant models for natural language tasks. However, many real-world datasets include additional modalities that the Transformer does not directly leverage. We present Multimodal-Toolkit, an open-source Python package for incorporating text and tabular (categorical and numerical) data with Transformers for downstream applications. The toolkit integrates with Hugging Face’s existing APIs, such as tokenization and the model hub, which makes it easy to download different pre-trained models.
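As a rough illustration of what incorporating text and tabular data can mean, the sketch below concatenates a text embedding with one-hot encoded categorical features and standardized numerical features. It is not the toolkit's API: the function names, the feature names, and the 4-dimensional stand-in for a Transformer embedding are all hypothetical, and concatenation is assumed here only as one simple way to combine the modalities.

```python
from typing import Dict, List, Sequence, Tuple


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical value as a one-hot vector over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def standardize(value: float, mean: float, std: float) -> float:
    """Scale a numerical feature using a precomputed mean and standard deviation."""
    return (value - mean) / std if std > 0 else 0.0


def combine_features(
    text_embedding: Sequence[float],
    categorical: Dict[str, str],
    numerical: Dict[str, float],
    vocabularies: Dict[str, Sequence[str]],
    stats: Dict[str, Tuple[float, float]],
) -> List[float]:
    """Concatenate a text embedding with encoded tabular features.

    Feature names are visited in sorted order so the layout of the
    combined vector is the same for every example.
    """
    combined = list(text_embedding)
    for name in sorted(categorical):
        combined.extend(one_hot(categorical[name], vocabularies[name]))
    for name in sorted(numerical):
        mean, std = stats[name]
        combined.append(standardize(numerical[name], mean, std))
    return combined


# Hypothetical example: a tiny stand-in for a Transformer text embedding.
text_embedding = [0.1, -0.3, 0.7, 0.2]
combined = combine_features(
    text_embedding,
    categorical={"department": "dresses"},
    numerical={"age": 40.0},
    vocabularies={"department": ["tops", "dresses", "bottoms"]},
    stats={"age": (30.0, 10.0)},
)
# 4 text dimensions + 3 one-hot dimensions + 1 numerical dimension
print(len(combined))
```

In practice the combined vector would feed a classification or regression head, and the text embedding would come from a pre-trained model rather than a hand-written list.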
Anthology ID:
2021.maiworkshop-1.10
Volume:
Proceedings of the Third Workshop on Multimodal Artificial Intelligence
Month:
June
Year:
2021
Address:
Mexico City, Mexico
Editors:
Amir Zadeh, Louis-Philippe Morency, Paul Pu Liang, Candace Ross, Ruslan Salakhutdinov, Soujanya Poria, Erik Cambria, Kelly Shi
Venue:
maiworkshop
Publisher:
Association for Computational Linguistics
Pages:
69–73
URL:
https://aclanthology.org/2021.maiworkshop-1.10
DOI:
10.18653/v1/2021.maiworkshop-1.10
Cite (ACL):
Ken Gu and Akshay Budhkar. 2021. A Package for Learning on Tabular and Text Data with Transformers. In Proceedings of the Third Workshop on Multimodal Artificial Intelligence, pages 69–73, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
A Package for Learning on Tabular and Text Data with Transformers (Gu & Budhkar, maiworkshop 2021)
PDF:
https://aclanthology.org/2021.maiworkshop-1.10.pdf
Code:
georgian-io/Multimodal-Toolkit