Annotating progressive aspect constructions in the spoken section of the British National Corpus

Andrew Caines, Paula Buttery


Abstract
We present a set of stand-off annotations for the ninety thousand sentences in the spoken section of the British National Corpus (BNC) that feature a progressive aspect verb group. These annotations may be matched to the original BNC text using the supplied document and sentence identifiers. The annotated features mostly relate to linguistic form: subject type, subject person and number, form of auxiliary verb, and clause type, tense and polarity. In addition, the sentences are classified for register, that is, the formality of the recording context, on three levels of 'spontaneity', with genres such as sermons and scripted speech at the most formal level and casual conversation at the least formal. The resource has been designed so that it may easily be augmented with further stand-off annotations. Expert linguistic annotations of spoken data such as these are valuable for improving the performance of natural language processing tools in the spoken language domain, and they assist linguistic research in general.
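The matching step described in the abstract, joining stand-off annotations to corpus sentences by document and sentence identifier, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the record layout, the field names, the integer coding of the three register levels, and the sample identifiers and values are hypothetical and do not reflect the actual file format of the released resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One stand-off record. Field names are hypothetical, loosely
    following the features listed in the abstract."""
    doc_id: str        # document identifier (assumed string form)
    sent_id: int       # sentence identifier within the document
    subject_type: str
    auxiliary: str
    clause_type: str
    tense: str
    polarity: str
    register: int      # assumed coding of the three 'spontaneity' levels


def attach(sentences, annotations):
    """Pair each annotated sentence with its stand-off record.

    `sentences` maps (doc_id, sent_id) -> sentence text; sentences
    without an annotation are skipped.
    """
    index = {(a.doc_id, a.sent_id): a for a in annotations}
    return [
        (text, index[key])
        for key, text in sentences.items()
        if key in index
    ]


# Illustrative data only: identifiers and feature values are made up.
sentences = {
    ("KB1", 12): "I'm going home now",
    ("KB1", 13): "Yes",
}
annotations = [
    Annotation("KB1", 12, "pronoun", "'m", "declarative",
               "present", "positive", 3),
]

matched = attach(sentences, annotations)
```

Because the annotations live outside the corpus text, further stand-off layers could be added in the same way, as additional records keyed on the same identifier pair, without touching the original text.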
Anthology ID:
L12-1648
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1699–1704
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/1087_Paper.pdf
Cite (ACL):
Andrew Caines and Paula Buttery. 2012. Annotating progressive aspect constructions in the spoken section of the British National Corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1699–1704, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Annotating progressive aspect constructions in the spoken section of the British National Corpus (Caines & Buttery, LREC 2012)