A Comparison of Strategies for Source-Free Domain Adaptation

Xin Su, Yiyun Zhao, Steven Bethard


Abstract
Data sharing restrictions are common in NLP, especially in the clinical domain, but there is limited research on adapting models to new domains without access to the original training data, a setting known as source-free domain adaptation. We take algorithms that traditionally assume access to the source-domain training data—active learning, self-training, and data augmentation—and adapt them for source-free domain adaptation. We then systematically compare these strategies across multiple tasks and domains. We find that active learning yields consistent gains across all SemEval 2021 Task 10 tasks and domains, but although the shared task saw successful self-trained and data-augmented models, our systematic comparison finds these strategies to be unreliable for source-free domain adaptation.
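As a concrete illustration of one of the strategies named in the abstract, below is a minimal sketch of self-training in the source-free setting: only the released source model and unlabeled target-domain data are used, with the model's own high-confidence predictions serving as pseudo-labels. The function names, confidence threshold, hyperparameters, and the assumption of a PyTorch classifier over plain feature tensors are illustrative only, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def select_pseudo_labels(model, unlabeled_loader, threshold=0.9):
    """Pseudo-label target-domain batches with the frozen source model,
    keeping only predictions whose confidence exceeds the threshold."""
    model.eval()
    kept_inputs, kept_labels = [], []
    with torch.no_grad():
        for batch in unlabeled_loader:          # batch: (batch_size, feature_dim) tensor
            probs = F.softmax(model(batch), dim=-1)
            conf, preds = probs.max(dim=-1)
            mask = conf >= threshold
            if mask.any():
                kept_inputs.append(batch[mask])
                kept_labels.append(preds[mask])
    if not kept_inputs:
        raise ValueError("No predictions passed the confidence threshold.")
    return torch.cat(kept_inputs), torch.cat(kept_labels)

def self_train(model, unlabeled_loader, epochs=3, lr=2e-5, threshold=0.9):
    """One round of source-free self-training: pseudo-label target data
    with the source model, then fine-tune that same model on the labels."""
    inputs, labels = select_pseudo_labels(model, unlabeled_loader, threshold)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        logits = model(inputs)
        loss = F.cross_entropy(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```

In practice the pseudo-labeling and fine-tuning steps can be repeated for several rounds, and the confidence threshold controls the trade-off between pseudo-label coverage and noise; the paper's systematic comparison examines when this kind of strategy helps or hurts.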
Anthology ID: 2022.acl-long.572
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 8352–8367
URL: https://aclanthology.org/2022.acl-long.572
DOI: 10.18653/v1/2022.acl-long.572
Cite (ACL): Xin Su, Yiyun Zhao, and Steven Bethard. 2022. A Comparison of Strategies for Source-Free Domain Adaptation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8352–8367, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): A Comparison of Strategies for Source-Free Domain Adaptation (Su et al., ACL 2022)
PDF: https://aclanthology.org/2022.acl-long.572.pdf
Code: xinsu626/sourcefreedomainadaptation