Language Model Pre-Training with Sparse Latent Typing

Liliang Ren, Zixuan Zhang, Han Wang, Clare Voss, ChengXiang Zhai, Heng Ji


Abstract
Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus only on text reconstruction and do not seek to learn interpretable latent-level representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Moreover, a language model pre-trained with this objective also significantly improves Information Extraction-related downstream tasks in both supervised and few-shot settings. Our code is publicly available at https://github.com/renll/SparseLT.
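
Illustrative sketch (not the authors' implementation; see the repository above for the actual method): one plausible way to realize the idea of sparsely typing tokens is a small typing head that assigns each token either one of K learned latent types or a "no-type" slot via Gumbel-softmax, and adds a sparsity penalty to the usual LM loss. The module names, the number of types, and the loss weighting below are all assumptions made for illustration.

# Illustrative sketch only -- NOT the code from https://github.com/renll/SparseLT.
# It shows one plausible way to type tokens into latent categories sparsely.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseLatentTyper(nn.Module):
    """Assigns each token one of `num_types` latent types or a 'no-type' slot.

    Assumptions (not from the paper): type assignments are sampled with
    Gumbel-softmax, and sparsity is encouraged by penalizing the expected
    number of typed (non-null) tokens.
    """

    def __init__(self, hidden_size: int, num_types: int = 16):
        super().__init__()
        # +1 logit for the "no-type" (null) slot at index 0.
        self.type_scorer = nn.Linear(hidden_size, num_types + 1)
        # A learned embedding per latent type (including the null slot).
        self.type_codebook = nn.Embedding(num_types + 1, hidden_size)

    def forward(self, token_states: torch.Tensor, tau: float = 1.0):
        # token_states: (batch, seq_len, hidden_size) from the encoder.
        logits = self.type_scorer(token_states)
        # Differentiable, near one-hot type assignment per token.
        assign = F.gumbel_softmax(logits, tau=tau, hard=True, dim=-1)
        # Soft lookup of the corresponding type embeddings.
        type_embeds = assign @ self.type_codebook.weight
        # Sparsity penalty: expected fraction of tokens given a real type
        # (probability mass outside the null slot at index 0).
        probs = logits.softmax(dim=-1)
        sparsity_loss = probs[..., 1:].sum(dim=-1).mean()
        return type_embeds, assign, sparsity_loss

if __name__ == "__main__":
    hidden = 64
    typer = ToySparseLatentTyper(hidden_size=hidden)
    states = torch.randn(2, 10, hidden)           # stand-in encoder outputs
    type_embeds, assign, sparsity_loss = typer(states)
    lm_loss = torch.tensor(0.0)                   # placeholder reconstruction loss
    total_loss = lm_loss + 0.1 * sparsity_loss    # 0.1 is an arbitrary weight
    print(assign.argmax(-1))                      # per-token latent type ids

In such a setup, most tokens would ideally fall into the null slot, so only a few keyword-like tokens carry a latent type; how the paper actually enforces sparsity and defines the typing space is described in the full text.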
Anthology ID:
2022.emnlp-main.96
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1480–1494
URL:
https://aclanthology.org/2022.emnlp-main.96
DOI:
10.18653/v1/2022.emnlp-main.96
Cite (ACL):
Liliang Ren, Zixuan Zhang, Han Wang, Clare Voss, ChengXiang Zhai, and Heng Ji. 2022. Language Model Pre-Training with Sparse Latent Typing. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 1480–1494, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Language Model Pre-Training with Sparse Latent Typing (Ren et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.96.pdf