E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks

Nikolaos Stylianou, Ioannis Vlahavas


Abstract
In the last decade, the field of Neural Language Modelling has changed enormously with the development of novel models based on Transformer architectures. However, even these models struggle to model long sequences due to memory constraints and increasing computational complexity. Coreference annotations over the training data can provide context far beyond the modelling limitations of such language models. In this paper we present an extension of the Transformer-block architecture used in neural language models, specifically in GPT2, that incorporates entity annotations during training. Our model, GPT2E, extends the Transformer layer architecture of GPT2 to Entity-Transformers, an architecture designed to handle coreference information when present. As a result, we obtain richer representations for entity mentions at negligible additional training cost. We compare GPT2 and GPT2E in terms of perplexity on the CoNLL 2012 and LAMBADA datasets, and examine the key differences in their entity representations and the effects of those differences on downstream tasks such as Named Entity Recognition. Furthermore, our approach can be adopted by the majority of Transformer-based language models.
Anthology ID:
2020.crac-1.1
Volume:
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference
Month:
December
Year:
2020
Address:
Barcelona, Spain (online)
Editors:
Maciej Ogrodniczuk, Vincent Ng, Yulia Grishina, Sameer Pradhan
Venue:
CRAC
Publisher:
Association for Computational Linguistics
Pages:
1–10
URL:
https://aclanthology.org/2020.crac-1.1
Cite (ACL):
Nikolaos Stylianou and Ioannis Vlahavas. 2020. E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks. In Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference, pages 1–10, Barcelona, Spain (online). Association for Computational Linguistics.
Cite (Informal):
E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks (Stylianou & Vlahavas, CRAC 2020)
PDF:
https://aclanthology.org/2020.crac-1.1.pdf
Data:
CoNLL-2012, LAMBADA