Semi-supervised Domain Adaptation for Dependency Parsing via Improved Contextualized Word Representations

Ying Li, Zhenghua Li, Min Zhang


Abstract
In recent years, parsing performance on in-domain texts has improved dramatically thanks to rapid progress in deep neural network models. The major challenge for current parsing research is to improve performance on out-of-domain texts that differ greatly from the in-domain training data, when only small-scale labeled out-of-domain data is available. To address this problem, we propose to improve contextualized word representations via adversarial learning and BERT fine-tuning. Concretely, we apply adversarial learning to three representative semi-supervised domain adaptation methods, i.e., direct concatenation (CON), feature augmentation (FA), and domain embedding (DE), with two useful strategies, i.e., fused target-domain word representations and orthogonality constraints, enabling the model to learn purer yet more effective domain-specific and domain-invariant representations. Simultaneously, we utilize large-scale unlabeled target-domain data to fine-tune BERT with only the language model loss, thus obtaining reliable contextualized word representations that benefit cross-domain dependency parsing. Experiments on a benchmark dataset show that our proposed adversarial approaches achieve consistent improvements, and fine-tuning BERT further boosts parsing accuracy by a large margin. Our single model achieves the same state-of-the-art performance as the top submitted system in the NLPCC-2019 shared task, which uses ensemble models and BERT.
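To make the two ingredients of the abstract concrete, here is a minimal PyTorch sketch of domain-adversarial training with an orthogonality constraint between domain-invariant (shared) and domain-specific (private) representations. This is a generic shared-private/gradient-reversal setup, not the paper's actual code; all module and variable names (GradReverse, SharedPrivateEncoder, etc.) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients
    in the backward pass, so the shared encoder learns to fool the
    domain classifier and becomes domain-invariant."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class SharedPrivateEncoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_domains=2):
        super().__init__()
        self.shared = nn.LSTM(input_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.private = nn.LSTM(input_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.domain_clf = nn.Linear(2 * hidden_dim, num_domains)

    def forward(self, embeds, lambd=1.0):
        h_shared, _ = self.shared(embeds)    # (B, T, 2H) domain-invariant
        h_private, _ = self.private(embeds)  # (B, T, 2H) domain-specific
        # Domain prediction on pooled shared features, through the
        # gradient reversal layer (adversarial signal).
        domain_logits = self.domain_clf(
            grad_reverse(h_shared.mean(dim=1), lambd))
        return h_shared, h_private, domain_logits

def orthogonality_loss(h_shared, h_private):
    """Squared Frobenius norm of H_s^T H_p: penalizes overlap between
    the shared and private subspaces, keeping each representation 'pure'."""
    prod = torch.bmm(h_shared.transpose(1, 2), h_private)  # (B, 2H, 2H)
    return (prod ** 2).sum() / h_shared.size(0)
```

For the second ingredient, below is a sketch of fine-tuning BERT on unlabeled target-domain text with only the masked language model loss, using the HuggingFace transformers library. The checkpoint name, file path, and hyperparameters are assumptions (a Chinese checkpoint is used here since the NLPCC-2019 shared task targets Chinese); the paper's exact fine-tuning setup may differ.

```python
from datasets import load_dataset
from transformers import (BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

# Plain-text file of unlabeled target-domain sentences (hypothetical path).
raw = load_dataset("text", data_files={"train": "target_domain_unlabeled.txt"})
tokenized = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-target-domain",
                           num_train_epochs=3,
                           per_device_train_batch_size=32),
    # The collator randomly masks tokens, so training uses only the
    # masked language model loss -- no parsing supervision.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                                  mlm=True,
                                                  mlm_probability=0.15),
    train_dataset=tokenized,
)
trainer.train()
# The adapted encoder then supplies contextualized word representations
# to the downstream dependency parser.
```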
Anthology ID:
2020.coling-main.338
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3806–3817
URL:
https://aclanthology.org/2020.coling-main.338
DOI:
10.18653/v1/2020.coling-main.338
Bibkey:
Cite (ACL):
Ying Li, Zhenghua Li, and Min Zhang. 2020. Semi-supervised Domain Adaptation for Dependency Parsing via Improved Contextualized Word Representations. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3806–3817, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Semi-supervised Domain Adaptation for Dependency Parsing via Improved Contextualized Word Representations (Li et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.338.pdf