A Self-Supervised Integration Method of Pretrained Language Models and Word Definitions

Hwiyeol Jo


Abstract
We investigate how the representations of pretrained language models align with those of humans, using the idea of word definition modeling: how well a word is represented by its definition, and vice versa. Our analysis shows that a word representation in pretrained language models does not map well to its human-written definition or to its usage in example sentences. We then present DefBERT, a simple method that integrates pretrained models with word semantics from dictionaries. We show its benefits on the newly proposed tasks of definition ranking and definition sense disambiguation. Furthermore, we report results on standard word similarity tasks and on short text classification tasks, where models must encode semantics from only a few words. The results demonstrate the effectiveness of integrating word definitions into pretrained language models.
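
The following is a minimal, illustrative sketch of the general idea behind the definition-ranking setup described in the abstract (scoring how well candidate dictionary definitions match a word's usage), not the paper's DefBERT method. It assumes the HuggingFace transformers and torch packages; the model name "bert-base-uncased", mean pooling, and cosine similarity are stand-in choices for illustration.

```python
# Sketch only: rank candidate definitions against a usage sentence by
# cosine similarity of mean-pooled encoder representations.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed model choice
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states into a single vector of shape (1, dim)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state      # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)        # (1, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)

def rank_definitions(word_in_context: str, definitions: list[str]) -> list[tuple[str, float]]:
    """Return definitions sorted by similarity to the usage sentence (highest first)."""
    query = embed(word_in_context)
    scored = [
        (d, torch.nn.functional.cosine_similarity(query, embed(d)).item())
        for d in definitions
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Example: two dictionary senses of "bank" ranked against a usage sentence.
print(rank_definitions(
    "She deposited the check at the bank.",
    ["a financial institution that accepts deposits",
     "the land alongside a river or lake"],
))
```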
Anthology ID:
2023.findings-acl.2
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14–26
URL:
https://aclanthology.org/2023.findings-acl.2
DOI:
10.18653/v1/2023.findings-acl.2
Cite (ACL):
Hwiyeol Jo. 2023. A Self-Supervised Integration Method of Pretrained Language Models and Word Definitions. In Findings of the Association for Computational Linguistics: ACL 2023, pages 14–26, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Self-Supervised Integration Method of Pretrained Language Models and Word Definitions (Jo, Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.2.pdf
Video:
https://aclanthology.org/2023.findings-acl.2.mp4