Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution

Laura Aina, Xixian Liao, Gemma Boleda, Matthijs Westera


Abstract
It is often posited that more predictable parts of a speaker’s meaning tend to be made less explicit, for instance using shorter, less informative words. Studying these dynamics in the domain of referring expressions has proven difficult, with existing studies, both psycholinguistic and corpus-based, providing contradictory results. We test the hypothesis that speakers produce less informative referring expressions (e.g., pronouns vs. full noun phrases) when the context is more informative about the referent, using novel computational estimates of referent predictability. We obtain these estimates by training an existing coreference resolution system for English on a new task, masked coreference resolution, giving us a probability distribution over referents that is conditioned on the context but not the referring expression. The resulting system retains standard coreference resolution performance while yielding a better estimate of human-derived referent predictability than previous attempts. A statistical analysis of the relationship between model output and mention form supports the hypothesis that predictability affects the form of a mention, both its morphosyntactic type and its length.
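
As a rough illustration of how a probability distribution over referents can be turned into a predictability score, the sketch below computes the surprisal of the true referent and the entropy of the distribution. This is not the authors' implementation: the function names, the candidate referents, and the probabilities are all hypothetical, and surprisal and entropy are simply common information-theoretic choices assumed here for the example.

```python
import math

def referent_surprisal(probs, true_referent):
    """Surprisal (in bits) of the true referent; lower means more predictable."""
    return -math.log2(probs[true_referent])

def entropy(probs):
    """Entropy (in bits) of the distribution; lower means a less uncertain context."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# Hypothetical output for one masked mention: a distribution over candidate
# referents conditioned on the context only (the mention itself is hidden).
masked_prediction = {"Mary": 0.5, "John": 0.25, "the dog": 0.25}

surprisal = referent_surprisal(masked_prediction, "Mary")
uncertainty = entropy(masked_prediction)
print(f"surprisal = {surprisal:.2f} bits, entropy = {uncertainty:.2f} bits")
```

Under the hypothesis stated in the abstract, mentions with lower surprisal would be expected to surface more often as short, less informative forms such as pronouns.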
Anthology ID:
2021.conll-1.36
Volume:
Proceedings of the 25th Conference on Computational Natural Language Learning
Month:
November
Year:
2021
Address:
Online
Venues:
CoNLL | EMNLP
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
454–469
URL:
https://aclanthology.org/2021.conll-1.36
PDF:
https://aclanthology.org/2021.conll-1.36.pdf
Code:
amore-upf/masked-coreference