Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages

Xu Han, Yuqi Luo, Weize Chen, Zhiyuan Liu, Maosong Sun, Zhou Botong, Hao Fei, Suncong Zheng


Abstract
Fine-grained entity typing (FGET) aims to classify named entity mentions into fine-grained entity types, which is meaningful for entity-related NLP tasks. For FGET, a key challenge is the low-resource problem: the complex entity type hierarchy makes it difficult to manually label data. For languages other than English in particular, human-labeled data is extremely scarce. In this paper, we propose a cross-lingual contrastive learning framework to learn FGET models for low-resource languages. Specifically, we use multi-lingual pre-trained language models (PLMs) as the backbone to transfer the typing knowledge from high-resource languages (such as English) to low-resource languages (such as Chinese). Furthermore, we introduce entity-pair-oriented heuristic rules as well as machine translation to obtain cross-lingual distantly-supervised data, and apply cross-lingual contrastive learning on the distantly-supervised data to enhance the backbone PLMs. Experimental results show that by applying our framework, we can easily learn effective FGET models for low-resource languages, even without any language-specific human-labeled data. Our code is available at https://github.com/thunlp/CrossET.
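The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of a generic InfoNCE-style loss, assuming that a mention in one language and its distantly-supervised counterpart in another language form a positive pair and that unrelated mentions serve as negatives. The function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper or from the CrossET repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for a single anchor.

    Returns the negative log-softmax score of the positive among
    the positive and all negatives (an assumed objective, not
    necessarily the one used in the paper).
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Toy embeddings, made up for illustration only.
en_mention = [0.9, 0.1, 0.0]   # hypothetical English mention encoding
zh_mention = [0.8, 0.2, 0.1]   # hypothetical translated counterpart
unrelated = [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0]]

# Loss when the true cross-lingual counterpart is the positive.
loss_aligned = info_nce(en_mention, zh_mention, unrelated)
# Loss when an unrelated mention is wrongly treated as the positive.
loss_misaligned = info_nce(en_mention, unrelated[0], [zh_mention, unrelated[1]])
```

In this toy setting the loss is small when the cross-lingual counterpart is the positive and large when an unrelated mention takes its place, which is the signal a contrastive objective of this kind would use to pull aligned representations together in the shared multilingual space.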
Anthology ID:
2022.acl-long.159
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2241–2250
URL:
https://aclanthology.org/2022.acl-long.159
DOI:
10.18653/v1/2022.acl-long.159
Cite (ACL):
Xu Han, Yuqi Luo, Weize Chen, Zhiyuan Liu, Maosong Sun, Zhou Botong, Hao Fei, and Suncong Zheng. 2022. Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2241–2250, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages (Han et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.159.pdf
Software:
 2022.acl-long.159.software.zip
Code:
 thunlp/crosset
Data:
Few-NERD, Open Entity