The Trade-offs of Domain Adaptation for Neural Language Models

David Grangier, Dan Iter


Abstract
This work connects language model adaptation with concepts from machine learning theory. We consider a training setup with a large out-of-domain set and a small in-domain set. We derive how the benefit of training a model on either set depends on the sizes of the sets and the distance between their underlying distributions. We analyze how out-of-domain pre-training followed by in-domain fine-tuning achieves better generalization than either approach alone. Finally, we show how adaptation techniques based on data selection, such as importance sampling, intelligent data selection, and influence functions, can be cast in a common framework that highlights both their similarities and their subtle differences.
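For readers unfamiliar with the setup the abstract describes, the following is a minimal, hypothetical sketch (not from the paper) of out-of-domain pre-training followed by in-domain fine-tuning. The tiny model, synthetic token data, and hyperparameters are placeholders chosen only to make the two-stage procedure concrete.

```python
# Sketch: pre-train a small LM on a large out-of-domain corpus,
# then fine-tune it on a small in-domain corpus (illustrative only).
import torch
import torch.nn as nn

VOCAB, DIM, SEQ_LEN = 1000, 64, 32

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)

def train(model, data, steps, lr):
    """Next-token prediction on batches of token ids."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        batch = data[torch.randint(len(data), (16,))]  # random mini-batch
        logits = model(batch[:, :-1])
        loss = loss_fn(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()

# Synthetic stand-ins: a large out-of-domain set and a small in-domain set.
out_domain = torch.randint(VOCAB, (10_000, SEQ_LEN))
in_domain = torch.randint(VOCAB, (200, SEQ_LEN))

model = TinyLM()
train(model, out_domain, steps=500, lr=1e-3)  # out-of-domain pre-training
train(model, in_domain, steps=100, lr=1e-4)   # in-domain fine-tuning
```

The trade-off analyzed in the paper concerns when the first stage helps: its value depends on the relative sizes of the two sets and the distance between their distributions.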
Anthology ID:
2022.acl-long.264
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3802–3813
URL:
https://aclanthology.org/2022.acl-long.264
DOI:
10.18653/v1/2022.acl-long.264
Cite (ACL):
David Grangier and Dan Iter. 2022. The Trade-offs of Domain Adaptation for Neural Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3802–3813, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
The Trade-offs of Domain Adaptation for Neural Language Models (Grangier & Iter, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.264.pdf
Video:
https://aclanthology.org/2022.acl-long.264.mp4