TADA: Efficient Task-Agnostic Domain Adaptation for Transformers

Chia-Chien Hung; Lukas Lange; Jannik Strötgen

doi:10.18653/v1/2023.findings-acl.31

Abstract

Intermediate training of pre-trained transformer-based language models on domain-specific data leads to substantial gains for downstream tasks. To increase efficiency and prevent the catastrophic forgetting associated with full domain-adaptive pre-training, approaches such as adapters have been developed. However, these require additional parameters for each layer, and are criticized for their limited expressiveness. In this work, we introduce TADA, a novel task-agnostic domain adaptation method which is modular, parameter-efficient, and thus, data-efficient. Within TADA, we retrain the embeddings to learn domain-aware input representations and tokenizers for the transformer encoder, while freezing all other parameters of the model. Then, task-specific fine-tuning is performed. We further conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in one model per task in multi-domain use cases. Our broad evaluation in 4 downstream tasks for 14 domains across single- and multi-domain setups and high- and low-resource scenarios reveals that TADA is an effective and efficient alternative to full domain-adaptive pre-training and adapters for domain adaptation, while not introducing additional parameters or complex training steps.
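
The core idea in the abstract, retraining only the embeddings while every other parameter stays frozen, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: `TinyEncoder`, `freeze_all_but_embeddings`, the layer sizes, and the toy reconstruction loss are all hypothetical stand-ins chosen so the example runs without downloading a pre-trained model or domain corpus.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical toy stand-in for a pre-trained transformer encoder."""

    def __init__(self, vocab_size=100, d_model=16):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=2, dim_feedforward=32, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        return self.lm_head(self.encoder(self.embeddings(token_ids)))


def freeze_all_but_embeddings(model):
    """Leave only the embedding parameters trainable; return their names."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("embeddings.")
    return [n for n, p in model.named_parameters() if p.requires_grad]


torch.manual_seed(0)
model = TinyEncoder()
trainable = freeze_all_but_embeddings(model)

# Only the trainable (embedding) parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# Snapshot the weights so we can check what one update step changes.
encoder_before = [p.detach().clone() for p in model.encoder.parameters()]
embeddings_before = model.embeddings.weight.detach().clone()

# Assumption: a toy loss (predict each input token) stands in for the
# real domain-adaptive training objective on domain-specific text.
token_ids = torch.randint(0, 100, (4, 8))
logits = model(token_ids)
loss = nn.functional.cross_entropy(logits.reshape(-1, 100), token_ids.reshape(-1))
loss.backward()
optimizer.step()

encoder_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(encoder_before, model.encoder.parameters())
)
embeddings_changed = not torch.equal(embeddings_before, model.embeddings.weight)
```

After one step, `encoder_unchanged` and `embeddings_changed` should both be true: the update touches the embedding matrix only, which is what keeps this style of domain adaptation free of additional parameters. Task-specific fine-tuning would follow as a separate stage and is not shown here.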

Anthology ID: 2023.findings-acl.31
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 487–503
URL: https://aclanthology.org/2023.findings-acl.31/
DOI: 10.18653/v1/2023.findings-acl.31
Cite (ACL): Chia-Chien Hung, Lukas Lange, and Jannik Strötgen. 2023. TADA: Efficient Task-Agnostic Domain Adaptation for Transformers. In Findings of the Association for Computational Linguistics: ACL 2023, pages 487–503, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): TADA: Efficient Task-Agnostic Domain Adaptation for Transformers (Hung et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.31.pdf
Video: https://aclanthology.org/2023.findings-acl.31.mp4