When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it

Sebastian Schuster, Tal Linzen


Abstract
Understanding longer narratives or participating in conversations requires tracking the discourse entities that have been mentioned. Indefinite noun phrases (NPs), such as ‘a dog’, frequently introduce discourse entities, but this behavior is modulated by sentential operators such as negation. For example, ‘a dog’ in ‘Arthur doesn’t own a dog’ does not introduce a discourse entity due to the presence of negation. In this work, we adapt the psycholinguistic assessment of language models paradigm to higher-level linguistic phenomena and introduce an English evaluation suite that targets the knowledge of the interactions between sentential operators and indefinite NPs. We use this evaluation suite for a fine-grained investigation of the entity tracking abilities of the Transformer-based models GPT-2 and GPT-3. We find that while the models are to a certain extent sensitive to the interactions we investigate, they are all challenged by the presence of multiple NPs and their behavior is not systematic, which suggests that even models at the scale of GPT-3 do not fully acquire basic entity tracking abilities.
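The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' evaluation suite: it only assumes that a model can be queried for the log-probability of a continuation given a context, and it counts how often a continuation that refers back to the entity is scored higher after a context that introduces the entity than after a negated one. The names `MinimalPair`, `accuracy`, and `toy_log_prob` are hypothetical, and `toy_log_prob` is a hand-written stand-in, not a language model.

```python
from typing import Callable, List, NamedTuple


class MinimalPair(NamedTuple):
    """One hypothetical test item: two contexts and a shared continuation."""
    entity_context: str     # the indefinite NP introduces a discourse entity
    no_entity_context: str  # an operator such as negation blocks the entity
    continuation: str       # text that refers back to the entity


# A scorer maps (context, continuation) to log p(continuation | context).
Scorer = Callable[[str, str], float]


def pair_is_correct(pair: MinimalPair, log_prob: Scorer) -> bool:
    """True if the referring continuation is more likely after the
    context that actually introduces the entity."""
    with_entity = log_prob(pair.entity_context, pair.continuation)
    without_entity = log_prob(pair.no_entity_context, pair.continuation)
    return with_entity > without_entity


def accuracy(pairs: List[MinimalPair], log_prob: Scorer) -> float:
    """Fraction of pairs on which the scorer prefers the licensed reference."""
    if not pairs:
        raise ValueError("need at least one minimal pair")
    return sum(pair_is_correct(p, log_prob) for p in pairs) / len(pairs)


def toy_log_prob(context: str, continuation: str) -> float:
    """Stand-in scorer (an assumption for illustration, NOT a real model):
    it assigns a lower score whenever the context contains "doesn't"."""
    return -5.0 if "doesn't" in context.lower() else -1.0


# Example item built from the sentence used in the abstract.
PAIRS = [
    MinimalPair(
        entity_context="Arthur owns a dog.",
        no_entity_context="Arthur doesn't own a dog.",
        continuation=" The dog is very friendly.",
    )
]

toy_accuracy = accuracy(PAIRS, toy_log_prob)
```

To evaluate an actual model, `toy_log_prob` would be replaced by a function that sums the model's token log-probabilities over the continuation; the rest of the sketch stays unchanged.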
Anthology ID:
2022.naacl-main.71
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
969–982
URL:
https://aclanthology.org/2022.naacl-main.71
DOI:
10.18653/v1/2022.naacl-main.71
Cite (ACL):
Sebastian Schuster and Tal Linzen. 2022. When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 969–982, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it (Schuster & Linzen, NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.71.pdf
Video:
https://aclanthology.org/2022.naacl-main.71.mp4
Code:
sebschu/discourse-entity-lm
Data:
LAMBADA