Automatically Generated Definitions and their utility for Modeling Word Meaning

Francesco Periti, David Alfter, Nina Tahmasebi


Abstract
Modeling lexical semantics is a challenging task, often suffering from interpretability pitfalls. In this paper, we delve into the generation of dictionary-like sense definitions and explore their utility for modeling word meaning. We fine-tune two Llama models and include an existing T5-based model in our evaluation. First, we evaluate the quality of the generated definitions on existing English benchmarks, setting new state-of-the-art results for the Definition Generation task. Next, we explore the use of definitions generated by our models as intermediate representations subsequently encoded as sentence embeddings. We evaluate this approach on lexical semantics tasks such as Word-in-Context, Word Sense Induction, and Lexical Semantic Change, setting new state-of-the-art results in all three tasks when compared to unsupervised baselines.
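The pipeline described in the abstract (generate a context-specific definition, then encode it as a sentence embedding and compare embeddings) can be sketched as follows. This is a minimal illustration only: the model identifiers and the prompt template are placeholders, not the authors' released checkpoints or their exact prompting setup.

```python
# Minimal sketch of "definitions as intermediate representations":
# generate a dictionary-like definition of a word in context, embed the
# definition with a sentence encoder, and compare definitions across contexts
# (Word-in-Context style). Model names and the prompt are illustrative placeholders.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

definer = pipeline("text-generation", model="some-definition-generator")  # placeholder causal LM
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder could be used

def define(word: str, context: str) -> str:
    """Generate a dictionary-like definition of `word` as used in `context`."""
    prompt = f'What is the definition of "{word}" in the following sentence? {context}\nDefinition:'
    out = definer(prompt, max_new_tokens=40, do_sample=False)[0]["generated_text"]
    return out[len(prompt):].strip()  # strip the echoed prompt (causal LM output)

# Word-in-Context style comparison: same word form, two usage contexts.
d1 = define("bank", "She sat on the bank of the river.")
d2 = define("bank", "He deposited the check at the bank.")
e1, e2 = encoder.encode([d1, d2], convert_to_tensor=True)
print(util.cos_sim(e1, e2).item())  # low similarity suggests different senses
```

The same definition embeddings can, in principle, be clustered for Word Sense Induction or compared across time-specific corpora for Lexical Semantic Change, as the abstract indicates.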
Anthology ID: 2024.emnlp-main.776
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 14008–14026
URL: https://aclanthology.org/2024.emnlp-main.776
Cite (ACL): Francesco Periti, David Alfter, and Nina Tahmasebi. 2024. Automatically Generated Definitions and their utility for Modeling Word Meaning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 14008–14026, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Automatically Generated Definitions and their utility for Modeling Word Meaning (Periti et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.776.pdf
Software: 2024.emnlp-main.776.software.zip
Data: 2024.emnlp-main.776.data.zip