SenseBERT: Driving Some Sense into BERT
Yoav Levine | Barak Lenz | Or Dagan | Ori Ram | Dan Padnos | Or Sharir | Shai Shalev-Shwartz | Amnon Shashua | Yoav Shoham
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
The ability to learn from large unlabeled corpora has allowed neural language models to advance the frontier in natural language understanding. However, existing self-supervision techniques operate at the word form level, which serves as a surrogate for the underlying semantic content. This paper proposes a method to employ weak-supervision directly at the word sense level. Our model, named SenseBERT, is pre-trained to predict not only the masked words but also their WordNet supersenses. Accordingly, we attain a lexical-semantic level language model, without the use of human annotation. SenseBERT achieves significantly improved lexical understanding, as we demonstrate by experimenting on SemEval Word Sense Disambiguation, and by attaining a state of the art result on the ‘Word in Context’ task.