Korean Children’s Spoken English Corpus and an Analysis of its Pronunciation Variability

Hyejin Hong, Sunhee Kim, Minhwa Chung


Abstract
This paper introduces the Korean Children's Spoken English Corpus (KC-SEC), a corpus of Korean-accented English speech produced by children and constructed by Seoul National University. The KC-SEC was developed to support research and development of computer-assisted language learning (CALL) systems for Korean learners of English, especially elementary school learners. It consists of read speech produced by 96 Korean learners aged 9 to 12. The corpus contains 11,937 sentences, amounting to about 16 hours of speech. In addition, a statistical analysis of the pronunciation variability in the corpus is performed to investigate the characteristics of Korean children's spoken English. The realized phonemes (hypothesis) are extracted through time-based phoneme alignment and compared with the target phonemes (reference). The results show that (i) the pronunciation variations found frequently in Korean children's speech are devoicing and changes in place and/or manner of articulation, and (ii) these variations largely correspond to those reported for general Korean learners' speech in previous studies, despite some differences.
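
The comparison of realized (hypothesis) and target (reference) phonemes described in the abstract can be illustrated with a minimal sketch. It assumes each phoneme is given as a (label, start, end) tuple in seconds and pairs segments by greatest time overlap; this pairing rule, the function names, and the example word are hypothetical and are not taken from the paper, which does not detail its procedure here.

```python
from collections import Counter


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the time overlap between two segments (0 if none)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def pair_by_time(reference, hypothesis):
    """Pair each reference phoneme with the hypothesis phoneme that overlaps it most.

    Both arguments are lists of (label, start, end) tuples. A reference
    phoneme with no overlapping hypothesis segment is paired with None.
    """
    pairs = []
    for r_label, r_start, r_end in reference:
        best_label, best_overlap = None, 0.0
        for h_label, h_start, h_end in hypothesis:
            ov = overlap(r_start, r_end, h_start, h_end)
            if ov > best_overlap:
                best_label, best_overlap = h_label, ov
        pairs.append((r_label, best_label))
    return pairs


def variation_counts(pairs):
    """Count (target, realized) pairs in which the realized phoneme differs."""
    return Counter((r, h) for r, h in pairs if h is not None and r != h)


# Hypothetical data: the word "dog" with a devoiced final consonant (g -> k).
reference = [("d", 0.00, 0.08), ("ao", 0.08, 0.25), ("g", 0.25, 0.33)]
hypothesis = [("d", 0.00, 0.09), ("ao", 0.09, 0.24), ("k", 0.24, 0.33)]

pairs = pair_by_time(reference, hypothesis)
counts = variation_counts(pairs)
print(counts)
```

Summing such counts over all utterances would give a frequency table of target-to-realized substitutions, the kind of tally from which patterns such as devoicing could be read off.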
Anthology ID:
L12-1398
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2362–2365
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/683_Paper.pdf
Cite (ACL):
Hyejin Hong, Sunhee Kim, and Minhwa Chung. 2012. Korean Children’s Spoken English Corpus and an Analysis of its Pronunciation Variability. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 2362–2365, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Korean Children’s Spoken English Corpus and an Analysis of its Pronunciation Variability (Hong et al., LREC 2012)