Language Modeling with Sparse Product of Sememe Experts

Yihong Gu, Jun Yan, Hao Zhu, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Fen Lin, Leyu Lin


Abstract
Most language modeling methods rely on large-scale data to statistically learn the sequential patterns of words. In this paper, we argue that words are atomic language units but not necessarily atomic semantic units. Inspired by HowNet, we use sememes, the minimum semantic units in human languages, to represent the implicit semantics behind words for language modeling, and we name the resulting model the Sememe-Driven Language Model (SDLM). More specifically, to predict the next word, SDLM first estimates the sememe distribution given the textual context. Afterwards, it regards each sememe as a distinct semantic expert, and these experts jointly identify the most probable senses and the corresponding word. In this way, SDLM enables language models to work beyond word-level manipulation, at the level of fine-grained sememe semantics, and offers more powerful tools to fine-tune language models and to improve their interpretability and robustness. Experiments on language modeling and the downstream application of headline generation demonstrate the effectiveness of SDLM.
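The prediction pipeline described in the abstract (context to sememes, sememes to senses, senses to words) can be illustrated with a small sketch. This is not the authors' implementation and does not reproduce the paper's exact equations: the toy lexicon, the vectors, the function name `predict_word_distribution`, and the particular way expert scores are weighted and combined are all assumptions made for illustration only.

```python
import math

# Hypothetical toy lexicon: each sense belongs to one word and carries a set of sememes.
SENSES = [
    ("apple", {"fruit"}),
    ("apple", {"computer", "brand"}),
    ("pear", {"fruit"}),
    ("laptop", {"computer"}),
]
SEMEMES = ["fruit", "computer", "brand"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def predict_word_distribution(context, sememe_vecs, expert_vecs):
    """Illustrative three-step prediction (an assumed simplification)."""
    # Step 1: estimate how strongly each sememe is expressed by the context.
    q = {e: sigmoid(dot(context, sememe_vecs[e])) for e in SEMEMES}

    # Step 2: each sememe acts as an expert that scores only the senses
    # containing it; expert scores are weighted by q and summed in log space,
    # which corresponds to a (sparse) product of experts after exponentiation.
    logits = []
    for i, (_, sememes) in enumerate(SENSES):
        logits.append(sum(q[e] * dot(context, expert_vecs[(e, i)]) for e in sememes))
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    sense_probs = [x / z for x in exps]

    # Step 3: a word's probability is the sum over its senses.
    word_probs = {}
    for (word, _), p in zip(SENSES, sense_probs):
        word_probs[word] = word_probs.get(word, 0.0) + p
    return q, sense_probs, word_probs


# Made-up parameters: the context vector points toward the "fruit" sememe.
context = [1.0, 0.0]
sememe_vecs = {"fruit": [3.0, 0.0], "computer": [-3.0, 0.0], "brand": [-3.0, 0.0]}
expert_vecs = {(e, i): [1.0, 0.0] for i, (_, sems) in enumerate(SENSES) for e in sems}

q, sense_probs, word_probs = predict_word_distribution(context, sememe_vecs, expert_vecs)
print(word_probs)
```

With these made-up parameters, the fruit-related senses receive most of the probability mass, so "pear" is ranked above "laptop". In a real model the context vector would come from a recurrent encoder and all parameters would be learned.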
Anthology ID:
D18-1493
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4642–4651
URL:
https://aclanthology.org/D18-1493
DOI:
10.18653/v1/D18-1493
Cite (ACL):
Yihong Gu, Jun Yan, Hao Zhu, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Fen Lin, and Leyu Lin. 2018. Language Modeling with Sparse Product of Sememe Experts. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4642–4651, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Language Modeling with Sparse Product of Sememe Experts (Gu et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1493.pdf
Attachment:
 D18-1493.Attachment.pdf
Code
 thunlp/SDLM-pytorch
Data
LCSTS, Penn Treebank