AusTalk: an audio-visual corpus of Australian English

Dominique Estival, Steve Cassidy, Felicity Cox, Denis Burnham


Abstract
This paper describes the AusTalk corpus, which was designed and created through the Big ASC, a collaborative project with the two main goals of providing a standardised infrastructure for audio-visual recordings in Australia and of producing a large audio-visual corpus of Australian English, with 3 hours of AV recordings for 1000 speakers. We first present the overall project, then describe the corpus itself and its components, the strict data collection protocol with high levels of standardisation and automation, and the processes put in place for quality control. We also discuss the annotation phase of the project, along with its goals and challenges; a major contribution of the project has been to explore procedures for automating annotations and we present our solutions. We conclude with the current status of the corpus and with some examples of research already conducted with this new resource. AusTalk is one of the corpora included in the HCS vLab, which is briefly sketched in the conclusion.
Anthology ID:
L14-1432
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3105–3109
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/520_Paper.pdf
Cite (ACL):
Dominique Estival, Steve Cassidy, Felicity Cox, and Denis Burnham. 2014. AusTalk: an audio-visual corpus of Australian English. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3105–3109, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
AusTalk: an audio-visual corpus of Australian English (Estival et al., LREC 2014)
PDF:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/520_Paper.pdf