Constructing an Annotated Story Corpus: Some Observations and Issues

Oi Yee Kwong


Abstract
This paper discusses our ongoing work on constructing an annotated corpus of children’s stories for further studies on the linguistic, computational, and cognitive aspects of story structure and understanding. Given its semantic nature and the need for extensive common sense and world knowledge, story understanding has been a notoriously difficult topic in natural language processing. In particular, the notion of story structure for maintaining coherence has received much attention, while its strong version in the form of story grammar has triggered much debate. The relation between discourse coherence and the interestingness, or the point, of a story has not been satisfactorily settled. Introspective analysis on story comprehension has led to some important observations, based on which we propose a preliminary annotation scheme covering the structural, functional, and emotional aspects connecting discourse segments in stories. The annotation process will shed light on how story structure interacts with story point via various linguistic devices, and the annotated corpus is expected to be a useful resource for computational discourse processing, especially for studying various issues regarding the interface between coherence and interestingness of stories.
Anthology ID:
L10-1509
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/736_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Oi Yee Kwong. 2010. Constructing an Annotated Story Corpus: Some Observations and Issues. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Constructing an Annotated Story Corpus: Some Observations and Issues (Kwong, LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/736_Paper.pdf