Discovering Phonesthemes with Sparse Regularization

Nelson F. Liu, Gina-Anne Levow, Noah A. Smith


Abstract
We introduce a simple method for extracting non-arbitrary form-meaning representations from a collection of semantic vectors. We treat the problem as one of feature selection for a model trained to predict word vectors from subword features. We apply this model to the problem of automatically discovering phonesthemes, which are submorphemic sound clusters that appear in words with similar meaning. Many of our model-predicted phonesthemes overlap with those proposed in the linguistics literature, and we validate our approach with human judgments.
Anthology ID:
W18-1206
Volume:
Proceedings of the Second Workshop on Subword/Character LEvel Models
Month:
June
Year:
2018
Address:
New Orleans
Editors:
Manaal Faruqui, Hinrich Schütze, Isabel Trancoso, Yulia Tsvetkov, Yadollah Yaghoobzadeh
Venue:
SCLeM
Publisher:
Association for Computational Linguistics
Pages:
49–54
URL:
https://aclanthology.org/W18-1206/
DOI:
10.18653/v1/W18-1206
Cite (ACL):
Nelson F. Liu, Gina-Anne Levow, and Noah A. Smith. 2018. Discovering Phonesthemes with Sparse Regularization. In Proceedings of the Second Workshop on Subword/Character LEvel Models, pages 49–54, New Orleans. Association for Computational Linguistics.
Cite (Informal):
Discovering Phonesthemes with Sparse Regularization (Liu et al., SCLeM 2018)
PDF:
https://aclanthology.org/W18-1206.pdf