Nonparametric Masked Language Modeling

Sewon Min; Weijia Shi; Mike Lewis; Xilun Chen; Wen-tau Yih; Hannaneh Hajishirzi; Luke Zettlemoyer

doi:10.18653/v1/2023.findings-acl.132

Abstract

Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first nonparametric masked language model that replaces this softmax with a nonparametric distribution over every phrase in a reference corpus. NPM fills in the [MASK] solely by retrieving a token from a text corpus. We show that NPM can be efficiently trained with a contrastive objective and an in-batch approximation to full corpus retrieval. Zero-shot evaluation on 16 tasks including classification, fact probing and question answering demonstrates that NPM outperforms significantly larger parametric models, either with or without a retrieve-and-generate approach. It is particularly strong at dealing with rare patterns (word senses or facts) and predicting rare or nearly unseen words (e.g., non-Latin script). We release the model and code at github.com/facebookresearch/NPM.
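
The core idea in the abstract, replacing a vocabulary softmax with a distribution over positions in a reference corpus, can be illustrated with a toy sketch. This is not the released NPM code: the function name `nonparametric_fill`, the hand-written 2-dimensional vectors, and the `temperature` parameter are assumptions made for illustration. The abstract describes retrieval over phrases produced by a trained encoder, whereas this sketch simplifies to single tokens with fixed vectors.

```python
import math
from collections import defaultdict


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def nonparametric_fill(query, corpus_tokens, corpus_vectors, temperature=1.0):
    """Score every corpus position against the [MASK] query vector,
    apply a softmax over positions (not over a vocabulary), and
    sum the probability mass of positions that share a token string."""
    scores = [dot(query, vec) / temperature for vec in corpus_vectors]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    probs = defaultdict(float)
    for token, weight in zip(corpus_tokens, weights):
        probs[token] += weight / total
    return dict(probs)


# Hypothetical reference corpus: one vector per token occurrence.
corpus_tokens = ["Seattle", "Toronto", "Seattle", "Thessaloniki"]
corpus_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]

# Hypothetical encoding of the [MASK] position.
query = [1.0, 0.0]

distribution = nonparametric_fill(query, corpus_tokens, corpus_vectors)
prediction = max(distribution, key=distribution.get)
```

Because the output space is whatever appears in the corpus, a rare word such as "Thessaloniki" receives probability mass without needing its own entry in a fixed output vocabulary, which is the property the abstract highlights for rare or nearly unseen words.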

Anthology ID: 2023.findings-acl.132
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2097–2118
URL: https://aclanthology.org/2023.findings-acl.132
DOI: 10.18653/v1/2023.findings-acl.132
Cite (ACL): Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, and Luke Zettlemoyer. 2023. Nonparametric Masked Language Modeling. In Findings of the Association for Computational Linguistics: ACL 2023, pages 2097–2118, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Nonparametric Masked Language Modeling (Min et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.132.pdf
