Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text

Junjie Xing, Kenny Zhu, Shaodian Zhang


Abstract
Chinese word segmentation (CWS) models trained on open-source corpora suffer a dramatic performance drop on domain-specific text, especially in domains with many specialized terms and diverse writing styles, such as the biomedical domain. However, building a domain-specific CWS model incurs an extremely high annotation cost. In this paper, we propose an approach that exploits domain-invariant knowledge transferred from high-resource to low-resource domains. Extensive experiments show that our model achieves consistently higher accuracy than single-task CWS and other transfer learning baselines, especially when there is a large disparity between the source and target domains.
Anthology ID:
C18-1307
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
3619–3630
URL:
https://aclanthology.org/C18-1307
Cite (ACL):
Junjie Xing, Kenny Zhu, and Shaodian Zhang. 2018. Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text. In Proceedings of the 27th International Conference on Computational Linguistics, pages 3619–3630, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text (Xing et al., COLING 2018)
PDF:
https://aclanthology.org/C18-1307.pdf
Code:
adapt-sjtu/AMTTL