ChildTalk: A Multi-Dialect Chinese Child Speech Corpus with Full-Length Child–Caregiver Conversations for Speech Recognition

Jiaming Zhou; Yujie Guo; Shiwan Zhao; Yao Lu; Jianye Wang; Haoqin Sun; Hui Wang; Yong Qin

Abstract

Automatic speech recognition (ASR) for children remains challenging due to developmental variability and the scarcity of high-quality corpora, especially for Mandarin and its dialects. In this paper, we present ChildTalk, a large-scale Chinese child speech corpus designed to address this gap. It contains 112.5 hours of speech from 498 children (aged 2–8) and 500 caregivers, recorded as natural child–caregiver conversations. Unlike prior Mandarin child ASR corpora that mainly release isolated utterances, ChildTalk provides full-length dialogues with complete transcriptions, preserving turn-taking and discourse context. To our knowledge, it is the first publicly available Mandarin child speech corpus with full-length dialogues and systematic coverage of standard Mandarin, eight Mandarin dialect subgroups, and two additional dialects (Southern Min and Jin). We benchmark end-to-end models trained from scratch, large pre-trained ASR models fine-tuned on ChildTalk, omni-modal LLMs in a zero-shot setting, and commercial speech transcription APIs. Fine-tuning on ChildTalk consistently improves both in-domain and cross-domain performance. These results indicate that ChildTalk provides a challenging, broad-coverage testbed for Chinese child ASR, dialect robustness, and dialogue-level modeling. The dataset will be made freely available for all academic purposes.

Anthology ID: 2026.findings-acl.251
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5103–5116
URL: https://aclanthology.org/2026.findings-acl.251/
Cite (ACL): Jiaming Zhou, Yujie Guo, Shiwan Zhao, Yao Lu, Jianye Wang, Haoqin Sun, Hui Wang, and Yong Qin. 2026. ChildTalk: A Multi-Dialect Chinese Child Speech Corpus with Full-Length Child–Caregiver Conversations for Speech Recognition. In Findings of the Association for Computational Linguistics: ACL 2026, pages 5103–5116, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ChildTalk: A Multi-Dialect Chinese Child Speech Corpus with Full-Length Child–Caregiver Conversations for Speech Recognition (Zhou et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.251.pdf
Checklist: 2026.findings-acl.251.checklist.pdf
