Desiderata for the Annotation of Information Structure in Complex Sentences

Hannah Booth


Abstract
Many annotation schemes for information structure have been developed in recent years (Calhoun et al., 2005; Paggio, 2006; Goetze et al., 2007; Bohnet et al., 2013; Riester et al., 2018), in line with increased attention to the interaction between discourse and other linguistic dimensions (e.g. syntax, semantics, prosody). However, a crucial issue which existing schemes either gloss over, or propose only crude guidelines for, is how to annotate information structure in complex sentences. This unsatisfactory treatment is unsurprising given that theoretical work on information structure has traditionally neglected its status in dependent clauses. In this paper, I evaluate the status of pre-existing annotation schemes in relation to this vexed issue, and outline certain desiderata as a foundation for novel, more nuanced approaches, informed by state-of-the-art theoretical insights (Erteschik-Shir, 2007; Bianchi and Frascarelli, 2010; Lahousse, 2010; Ebert et al., 2014; Matic et al., 2014; Lahousse, 2022). These desiderata relate both to annotation formats and to the annotation process. The practical implications of these desiderata are illustrated via a test case using the Corpus of Historical Low German (Booth et al., 2020). Overall, the paper showcases the benefits which result from a free exchange between linguistic annotation models and theoretical research.
Anthology ID:
2022.law-1.5
Volume:
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Sameer Pradhan, Sandra Kuebler
Venue:
LAW
SIG:
SIGANN
Publisher:
European Language Resources Association
Pages:
31–43
URL:
https://aclanthology.org/2022.law-1.5
Cite (ACL):
Hannah Booth. 2022. Desiderata for the Annotation of Information Structure in Complex Sentences. In Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022, pages 31–43, Marseille, France. European Language Resources Association.
Cite (Informal):
Desiderata for the Annotation of Information Structure in Complex Sentences (Booth, LAW 2022)
PDF:
https://aclanthology.org/2022.law-1.5.pdf