Word Sense Disambiguation as a Game of Neurosymbolic Darts

Tiansi Dong, Rafet Sifa


Abstract
Word Sense Disambiguation (WSD) is one of the hardest tasks in natural language understanding and knowledge engineering. The glass ceiling of an 80% F1 score was recently reached through supervised learning enriched by knowledge graphs. Here, we propose a novel neurosymbolic methodology that may push the F1 score above 90%. The core of our methodology is a neurosymbolic sense embedding, represented as a configuration of nested n-dimensional balls. The central point of each ball closely preserves the pre-trained word embedding learned from data, which partially fixes the locations of the balls. Inclusion relations among balls precisely encode symbolic hypernym relations among senses and enable simple logical deduction among sense embeddings. We trained a Transformer to learn the mapping from a contextualized word embedding to its sense ball embedding, much like playing a game of darts (throwing darts at a dartboard). A series of experiments was carried out using pre-trained n-ball embeddings, which cover around 70% of the training data and 75% of the testing data in the benchmark WSD corpus. Euclidean distance and cosine similarity were used separately as objective functions, and each reaches an F1 score above 95.0% on the ALL-n-ball dataset. This substantially breaks the glass ceiling of deep learning methods. Future work is discussed to develop a full-fledged neurosymbolic WSD system that substantially outperforms deep learning approaches.
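The geometric idea in the abstract (senses as nested balls, hypernymy as ball inclusion, and disambiguation as landing a predicted point in the right ball) can be illustrated with a toy sketch. This is not the authors' implementation: the `Ball` class, the helper names, the 2-dimensional coordinates, and the sense labels are all hypothetical, and the Transformer that would produce the predicted point is replaced here by a hand-picked vector.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Ball:
    """A sense embedding as an n-dimensional ball (hypothetical toy structure)."""
    center: tuple
    radius: float


def contains_ball(outer: Ball, inner: Ball) -> bool:
    """True if `inner` lies entirely inside `outer` (hyponym inside hypernym)."""
    return math.dist(outer.center, inner.center) + inner.radius <= outer.radius


def hits(ball: Ball, point: tuple) -> bool:
    """True if the 'dart' (a predicted point) lands inside the ball."""
    return math.dist(ball.center, point) <= ball.radius


def nearest_sense(point: tuple, candidates: dict) -> str:
    """Pick the candidate sense whose ball center is closest in Euclidean distance."""
    return min(candidates, key=lambda name: math.dist(candidates[name].center, point))


# Toy 2-D configuration; all coordinates and labels are made up for illustration.
institution = Ball(center=(0.0, 0.0), radius=5.0)
slope = Ball(center=(10.0, 0.0), radius=4.0)
senses = {
    "bank.financial": Ball(center=(1.0, 1.0), radius=1.0),
    "bank.river": Ball(center=(9.0, 1.0), radius=1.0),
}

# Inclusion encodes the hypernym relation.
print(contains_ball(institution, senses["bank.financial"]))
print(contains_ball(institution, senses["bank.river"]))

# Stand-in for the model output for "bank" in a financial context.
dart = (1.2, 0.8)
predicted = nearest_sense(dart, senses)
print(predicted, hits(senses[predicted], dart), hits(institution, dart))
```

Because the financial-sense ball sits inside the institution ball, any dart that hits the former also hits the latter, which is the kind of simple deduction that inclusion relations make available. The sketch picks the sense by Euclidean distance only; a cosine-similarity variant would swap the distance function in `nearest_sense`.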
Anthology ID:
2024.neusymbridge-1.3
Volume:
Proceedings of the Workshop: Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning (NeusymBridge) @ LREC-COLING-2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Tiansi Dong, Erhard Hinrichs, Zhen Han, Kang Liu, Yangqiu Song, Yixin Cao, Christian F. Hempelmann, Rafet Sifa
Venues:
NeusymBridge | WS
Publisher:
ELRA and ICCL
Pages:
22–32
URL:
https://aclanthology.org/2024.neusymbridge-1.3
Cite (ACL):
Tiansi Dong and Rafet Sifa. 2024. Word Sense Disambiguation as a Game of Neurosymbolic Darts. In Proceedings of the Workshop: Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning (NeusymBridge) @ LREC-COLING-2024, pages 22–32, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Word Sense Disambiguation as a Game of Neurosymbolic Darts (Dong & Sifa, NeusymBridge-WS 2024)
PDF:
https://aclanthology.org/2024.neusymbridge-1.3.pdf