Pseudo-Label Guided Unsupervised Domain Adaptation of Contextual Embeddings

Tianyu Chen, Shaohan Huang, Furu Wei, Jianxin Li


Abstract
Contextual embedding models such as BERT can be easily fine-tuned on labeled samples to create state-of-the-art models for many downstream tasks. However, the performance of a fine-tuned BERT model degrades considerably when it is applied to a different domain for which only unlabeled data is available. In unsupervised domain adaptation, we aim to train a model that works well on a target domain when provided with labeled source samples and unlabeled target samples. In this paper, we propose a pseudo-label guided method for unsupervised domain adaptation. Two models are fine-tuned on the labeled source samples as pseudo-labeling models. To learn representations for the target domain, one of these models is further adapted through masked language modeling on unlabeled target-domain text. Both models are then used to assign pseudo-labels to target samples, and the final model is trained on these pseudo-labeled samples. We evaluate our method on named entity segmentation and sentiment analysis tasks, and the experiments show that our approach outperforms baseline methods.
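To make the pipeline described in the abstract concrete, below is a minimal, hypothetical sketch of the pseudo-labeling step, assuming a sentence-level classification task and the Hugging Face transformers library. The model names, the confidence threshold, and the agreement-based selection rule are illustrative assumptions, not the authors' exact procedure; the source fine-tuning and target-domain MLM adaptation steps are only noted in comments.

```python
# Hypothetical sketch of the pseudo-labeling pipeline described in the abstract.
# Model names, the threshold, and the agreement rule are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Two pseudo-labeling models: both fine-tuned on labeled source samples;
# one is additionally adapted to the target domain with masked language modeling
# (the fine-tuning and MLM adaptation steps themselves are omitted here).
source_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
adapted_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

@torch.no_grad()
def predict(model, texts):
    """Return predicted labels and confidences for a list of texts."""
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    probs = torch.softmax(model(**batch).logits, dim=-1)
    conf, label = probs.max(dim=-1)
    return label, conf

def pseudo_label(target_texts, threshold=0.9):
    """Keep target samples on which both models agree with high confidence (assumed rule)."""
    labels_a, conf_a = predict(source_model, target_texts)
    labels_b, conf_b = predict(adapted_model, target_texts)
    keep = (labels_a == labels_b) & (conf_a > threshold) & (conf_b > threshold)
    return [(t, int(l)) for t, l, k in zip(target_texts, labels_a, keep) if k]

# The resulting (text, pseudo-label) pairs would then be used to train the final
# target-domain model with standard supervised fine-tuning.
```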
Anthology ID:
2021.adaptnlp-1.2
Volume:
Proceedings of the Second Workshop on Domain Adaptation for NLP
Month:
April
Year:
2021
Address:
Kyiv, Ukraine
Editors:
Eyal Ben-David, Shay Cohen, Ryan McDonald, Barbara Plank, Roi Reichart, Guy Rotman, Yftah Ziser
Venue:
AdaptNLP
Publisher:
Association for Computational Linguistics
Pages:
9–15
URL:
https://aclanthology.org/2021.adaptnlp-1.2
Cite (ACL):
Tianyu Chen, Shaohan Huang, Furu Wei, and Jianxin Li. 2021. Pseudo-Label Guided Unsupervised Domain Adaptation of Contextual Embeddings. In Proceedings of the Second Workshop on Domain Adaptation for NLP, pages 9–15, Kyiv, Ukraine. Association for Computational Linguistics.
Cite (Informal):
Pseudo-Label Guided Unsupervised Domain Adaptation of Contextual Embeddings (Chen et al., AdaptNLP 2021)
PDF:
https://aclanthology.org/2021.adaptnlp-1.2.pdf
Data
CoNLL 2003