Modeling Sense Structure in Word Usage Graphs with the Weighted Stochastic Block Model

Dominik Schlechtweg, Enrique Castaneda, Jonas Kuhn, Sabine Schulte im Walde


Abstract
We propose modeling human-annotated Word Usage Graphs, which capture fine-grained semantic proximity distinctions between word uses, with a Bayesian formulation of the Weighted Stochastic Block Model, a generative model for random graphs popular in biology, physics, and the social sciences. By providing a probabilistic model of graded word meaning, we aim to approach the slippery yet widely used notion of word sense in a novel way. The proposed framework enables us to rigorously compare models of word senses with respect to their fit to the data. We perform extensive experiments and select the empirically most adequate model.
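To make the idea of a generative model for weighted graphs concrete, here is a minimal sketch of how a weighted stochastic block model could sample a graph once nodes (word uses) have been assigned to blocks (candidate senses). It is not the authors' implementation. The function name `sample_wsbm`, the ordinal weight range 1 to 4, and the shifted-binomial weight distribution are assumptions chosen for illustration only.

```python
import random
from itertools import combinations


def sample_wsbm(blocks, edge_prob, weight_prob, n_trials=3, seed=0):
    """Sample an undirected weighted graph from a toy weighted SBM.

    blocks[i]         -- block label of node i (hypothetical "sense" of a use)
    edge_prob[r][s]   -- probability that a pair from blocks r and s is connected
    weight_prob[r][s] -- success probability of a binomial with n_trials trials;
                         the edge weight is 1 + number of successes, so weights
                         fall in 1..n_trials+1 (an assumed ordinal scale)
    Returns a dict mapping node pairs (i, j), i < j, to integer weights.
    """
    rng = random.Random(seed)
    edges = {}
    for i, j in combinations(range(len(blocks)), 2):
        r, s = blocks[i], blocks[j]
        # The block pair alone decides whether the edge exists ...
        if rng.random() < edge_prob[r][s]:
            # ... and which distribution its weight is drawn from.
            successes = sum(rng.random() < weight_prob[r][s] for _ in range(n_trials))
            edges[(i, j)] = 1 + successes
    return edges


# Two blocks of three uses each: high proximity within a block, low across.
blocks = [0, 0, 0, 1, 1, 1]
edge_prob = [[1.0, 1.0], [1.0, 1.0]]
weight_prob = [[0.9, 0.1], [0.1, 0.9]]
graph = sample_wsbm(blocks, edge_prob, weight_prob)
```

Inference runs in the opposite direction: given an observed graph, one searches for the block assignment and parameters under which the graph is most plausible, which is what allows competing sense models to be compared by their fit to the data.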
Anthology ID:
2021.starsem-1.23
Volume:
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics
Month:
August
Year:
2021
Address:
Online
Venues:
*SEM | ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
241–251
URL:
https://aclanthology.org/2021.starsem-1.23
DOI:
10.18653/v1/2021.starsem-1.23
Cite (ACL):
Dominik Schlechtweg, Enrique Castaneda, Jonas Kuhn, and Sabine Schulte im Walde. 2021. Modeling Sense Structure in Word Usage Graphs with the Weighted Stochastic Block Model. In Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics, pages 241–251, Online. Association for Computational Linguistics.
Cite (Informal):
Modeling Sense Structure in Word Usage Graphs with the Weighted Stochastic Block Model (Schlechtweg et al., *SEM 2021)
PDF:
https://aclanthology.org/2021.starsem-1.23.pdf
Code:
kicasta/modeling_wugs_wsbm