Revisiting the Givenness Hierarchy. A Corpus-Based Evaluation

Christian Chiarcos

doi:10.18653/v1/2025.crac-1.3

Abstract

Gundel et al.’s Givenness Hierarchy remains one of the most influential frameworks of Information Status to date. It has been employed in different technical contexts to account for context-sensitive and hearer-tailored language in human-machine interaction and natural language processing, and it has also been a topic of linguistic inquiry in its own right. At the same time, the data basis upon which this theory was developed remains relatively thin. Although its applicability to a broad array of languages has been repeatedly confirmed, the empirical evidence presented for certain phenomena, in particular for demonstrative determiners and demonstrative pronouns, did not always reach conventional levels of statistical significance. In this paper, we provide an empirical, corpus-based re-assessment of two seminal papers for the Givenness Hierarchy, Gundel et al. (1990) and Gundel et al. (1993), in which we aim to replicate their findings on the basis of corpora with coreference annotation for their original sample of languages, i.e., Arabic, Chinese, English, Japanese, Korean, Russian and Spanish. We describe the operationalization of Gundel et al.’s ‘cognitive statuses’, their approximation by means of anaphoric relations, and the preprocessing of diverse and heterogeneous corpora, and we evaluate Gundel et al.’s claims. Our contribution is three-fold: First, we evaluate the Givenness Hierarchy against quantitative data at a scale that allows statistical significance to be assessed. Second, we discuss challenges and problems encountered in the process, both in the preprocessing and in the interpretation of the diverse corpora. Third, we provide two generalizations: a procedure for bootstrapping Givenness Hierarchies for other languages, and possible cross-linguistically applicable tendencies in the systems of referring expressions.

Anthology ID: 2025.crac-1.3
Volume: Proceedings of the Eighth Workshop on Computational Models of Reference, Anaphora and Coreference
Month: November
Year: 2025
Address: Suzhou, China
Editors: Maciej Ogrodniczuk, Michal Novak, Massimo Poesio, Sameer Pradhan, Vincent Ng
Venue: CRAC
Publisher: Association for Computational Linguistics
Pages: 24–41
URL: https://aclanthology.org/2025.crac-1.3/
DOI: 10.18653/v1/2025.crac-1.3
Cite (ACL): Christian Chiarcos. 2025. Revisiting the Givenness Hierarchy. A Corpus-Based Evaluation. In Proceedings of the Eighth Workshop on Computational Models of Reference, Anaphora and Coreference, pages 24–41, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Revisiting the Givenness Hierarchy. A Corpus-Based Evaluation (Chiarcos, CRAC 2025)
PDF: https://aclanthology.org/2025.crac-1.3.pdf
