Influence of Text Type and Text Length on Anaphoric Annotation

Daniela Goecke, Maik Stührenberg, Andreas Witt


Abstract
We report the results of a study investigating inter-annotator agreement on anaphoric annotation. The study focuses on how text length and text type influence agreement, using a corpus of scientific articles and newspaper texts. To measure inter-annotator agreement, we compare existing approaches and propose measuring each step of the annotation process separately rather than only the resulting anaphoric relations. In total, 3,642 anaphoric relations were annotated in a corpus of 53,038 tokens (12,327 markables). The results show that text type has more influence on inter-annotator agreement than text length. Furthermore, well-defined annotation instructions and coder training are crucial for obtaining good annotation results.
Anthology ID:
L08-1330
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/368_paper.pdf
Cite (ACL):
Daniela Goecke, Maik Stührenberg, and Andreas Witt. 2008. Influence of Text Type and Text Length on Anaphoric Annotation. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Influence of Text Type and Text Length on Anaphoric Annotation (Goecke et al., LREC 2008)