BERTese: Learning to Speak to BERT

Adi Haviv, Jonathan Berant, Amir Globerson


Abstract
Large pre-trained language models have been shown to encode large amounts of world and commonsense knowledge in their parameters, leading to substantial interest in methods for extracting that knowledge. In past work, knowledge was extracted by taking manually-authored queries and gathering paraphrases for them using a separate pipeline. In this work, we propose a method for automatically rewriting queries into “BERTese”, a paraphrase query that is directly optimized towards better knowledge extraction. To encourage meaningful rewrites, we add auxiliary loss functions that encourage the query to correspond to actual language tokens. We empirically show our approach outperforms competing baselines, obviating the need for complex pipelines. Moreover, BERTese provides some insight into the type of language that helps language models perform knowledge extraction.
Anthology ID:
2021.eacl-main.316
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
3618–3623
URL:
https://aclanthology.org/2021.eacl-main.316
DOI:
10.18653/v1/2021.eacl-main.316
Cite (ACL):
Adi Haviv, Jonathan Berant, and Amir Globerson. 2021. BERTese: Learning to Speak to BERT. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 3618–3623, Online. Association for Computational Linguistics.
Cite (Informal):
BERTese: Learning to Speak to BERT (Haviv et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.316.pdf
Data
LAMA, T-REx