Identification of Adjective-Noun Neologisms using Pretrained Language Models

John Philip McCrae


Abstract
Neologism detection is a key task in the construction of lexical resources and has wider implications for NLP; however, the identification of multiword neologisms has received little attention. In this paper, we show that we can effectively distinguish compositional from non-compositional adjective-noun pairs by using pretrained language models and comparing these with individual word embeddings. Our results show that these models significantly improve over baseline linguistic features; however, combining them with linguistic features improves the results further, suggesting the strength of a hybrid approach.
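The abstract does not say how phrase-level and word-level representations are compared. The sketch below is one common way to frame compositionality, not the paper's actual method: score a pair by the cosine similarity between a vector for the whole phrase and the average of its two word vectors. The function names and the toy vectors are hypothetical, and in practice the vectors would come from a pretrained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compositionality_score(phrase_vec, adj_vec, noun_vec):
    """Similarity between a phrase vector and the mean of its word vectors.

    Assumption: a low score hints that the phrase meaning is not simply
    the combination of its parts (i.e. it may be non-compositional).
    """
    composed = [(a + n) / 2.0 for a, n in zip(adj_vec, noun_vec)]
    return cosine(phrase_vec, composed)


# Toy, hand-made vectors (hypothetical values, not real model output).
adj = [1.0, 0.0, 0.0]
noun = [0.0, 1.0, 0.0]
compositional_phrase = [1.0, 1.0, 0.0]  # lies along adj + noun
idiomatic_phrase = [0.0, 0.0, 1.0]      # orthogonal to both words

high = compositionality_score(compositional_phrase, adj, noun)
low = compositionality_score(idiomatic_phrase, adj, noun)
```

With these toy inputs, the phrase aligned with its component words scores higher than the orthogonal one. A score like this could serve as one feature alongside linguistic features in a hybrid classifier of the kind the abstract describes.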
Anthology ID:
W19-5116
Volume:
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Agata Savary, Carla Parra Escartín, Francis Bond, Jelena Mitrović, Verginica Barbu Mititelu
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
135–141
URL:
https://aclanthology.org/W19-5116
DOI:
10.18653/v1/W19-5116
Cite (ACL):
John Philip McCrae. 2019. Identification of Adjective-Noun Neologisms using Pretrained Language Models. In Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019), pages 135–141, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Identification of Adjective-Noun Neologisms using Pretrained Language Models (McCrae, MWE 2019)
PDF:
https://aclanthology.org/W19-5116.pdf
Code:
jmccrae/adj-noun-neologism-identification