You Are My Type! Type Embeddings for Pre-trained Language Models

Mohammed Saeed, Paolo Papotti


Abstract
One reason for the positive impact of Pre-trained Language Models (PLMs) in NLP tasks is their ability to encode semantic types, such as ‘European City’ or ‘Woman’. While previous work has analyzed such information in the context of interpretability, it is not clear how to use types to steer the PLM output. For example, in a cloze statement, it is desirable to steer the model to generate a token that satisfies a user-specified type, e.g., predict a date rather than a location. In this work, we introduce Type Embeddings (TEs), an input embedding that promotes desired types in a PLM. Our proposal is to define a type by a small set of word examples. We empirically study the ability of TEs both to represent types and to steer BERT's masked-token predictions without changes to the prompt text. Finally, using the LAMA datasets, we show that TEs substantially improve the precision of fact extraction from PLMs.
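The abstract does not spell out how a TE is built or applied; the following is a minimal sketch of one plausible realization, assuming a TE is approximated as the mean input embedding of the example words and injected at the [MASK] position of an unchanged prompt. The example words, prompt, and scaling of the injection are illustrative guesses, not the paper's confirmed procedure.

```python
# Minimal sketch (not the paper's exact method): build a "type embedding" by
# averaging BERT's input embeddings of a few example words, then add it to the
# [MASK] position's input embedding before decoding with the masked-LM head.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# A hypothetical "city" type, defined by a small set of single-token examples.
type_examples = ["paris", "london", "berlin", "madrid"]

with torch.no_grad():
    emb_table = model.get_input_embeddings().weight              # [vocab, hidden]
    example_ids = tokenizer.convert_tokens_to_ids(type_examples)
    type_embedding = emb_table[example_ids].mean(dim=0)          # [hidden]

    prompt = "Mozart was born in [MASK]."
    inputs = tokenizer(prompt, return_tensors="pt")
    input_embeds = model.get_input_embeddings()(inputs["input_ids"]).clone()

    # Inject the type embedding at the [MASK] position; the prompt text itself
    # is left untouched, as described in the abstract.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    input_embeds[0, mask_pos] += type_embedding

    logits = model(inputs_embeds=input_embeds,
                   attention_mask=inputs["attention_mask"]).logits

top_ids = logits[0, mask_pos][0].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top_ids))  # expect city-like completions
```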
Anthology ID: 2022.findings-emnlp.336
Volume: Findings of the Association for Computational Linguistics: EMNLP 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4583–4598
URL: https://aclanthology.org/2022.findings-emnlp.336
DOI: 10.18653/v1/2022.findings-emnlp.336
Cite (ACL): Mohammed Saeed and Paolo Papotti. 2022. You Are My Type! Type Embeddings for Pre-trained Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4583–4598, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): You Are My Type! Type Embeddings for Pre-trained Language Models (Saeed & Papotti, Findings 2022)
PDF: https://aclanthology.org/2022.findings-emnlp.336.pdf