Domain Adaptation with BERT-based Domain Classification and Data Selection

Xiaofei Ma; Peng Xu; Zhiguo Wang; Ramesh Nallapati; Bing Xiang

doi:10.18653/v1/D19-6109

Abstract

The performance of deep neural models can deteriorate substantially when there is a domain shift between training and test data. For example, the pre-trained BERT model can be easily fine-tuned with just one additional output layer to create a state-of-the-art model for a wide range of tasks. However, the fine-tuned BERT model degrades considerably when applied zero-shot to a different domain. In this paper, we present a novel two-step domain adaptation framework based on curriculum learning and domain-discriminative data selection. The domain adaptation is conducted in a mostly unsupervised manner, using a small target-domain validation set for hyper-parameter tuning. We tested the framework on four large public datasets with different domain similarities and task types. Our framework outperforms a popular discrepancy-based domain adaptation method on most transfer tasks while consuming only a fraction of the training budget.
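The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one plausible reading of "domain-discriminative data selection": rank source-domain examples by how target-like a domain classifier judges them, and keep the top fraction. The function name `select_by_domain_score`, the `keep_fraction` parameter, and the descending ordering are all assumptions made for illustration, not the authors' implementation; the scores are assumed to come from some separately trained domain classifier (for instance a fine-tuned BERT model).

```python
from typing import List, Sequence, Tuple


def select_by_domain_score(
    examples: Sequence[str],
    target_probs: Sequence[float],
    keep_fraction: float = 0.5,
) -> List[Tuple[str, float]]:
    """Keep the source examples that look most like the target domain.

    `target_probs[i]` is assumed to be a domain classifier's probability
    that `examples[i]` belongs to the target domain (hypothetical input).
    """
    if len(examples) != len(target_probs):
        raise ValueError("examples and target_probs must have equal length")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")

    # Rank source examples from most to least target-like.
    ranked = sorted(zip(examples, target_probs), key=lambda p: p[1], reverse=True)

    # Keep the top fraction; at least one example if any are available.
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]


source = ["a", "b", "c", "d"]
scores = [0.1, 0.9, 0.4, 0.7]
selected = select_by_domain_score(source, scores, keep_fraction=0.5)
```

The returned list is already ordered by score, so it could also serve as a simple curriculum order for fine-tuning; whether the paper trains in this order is not stated in the abstract.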

Anthology ID: D19-6109
Volume: Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Colin Cherry, Greg Durrett, George Foster, Reza Haffari, Shahram Khadivi, Nanyun Peng, Xiang Ren, Swabha Swayamdipta
Venue: WS
Publisher: Association for Computational Linguistics
Pages: 76–83
URL: https://aclanthology.org/D19-6109/
DOI: 10.18653/v1/D19-6109
Cite (ACL): Xiaofei Ma, Peng Xu, Zhiguo Wang, Ramesh Nallapati, and Bing Xiang. 2019. Domain Adaptation with BERT-based Domain Classification and Data Selection. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), pages 76–83, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Domain Adaptation with BERT-based Domain Classification and Data Selection (Ma et al., 2019)
PDF: https://aclanthology.org/D19-6109.pdf
