Morphological Complexity of Children Narratives in Eight Languages

Gordana Hržica, Chaya Liebeskind, Kristina Š. Despot, Olga Dontcheva-Navratilova, Laura Kamandulytė-Merfeldienė, Sara Košutar, Matea Kramarić, Giedrė Valūnaitė Oleškevičienė


Abstract
The aim of this study was to compare morphological complexity in the language production of younger and older children across different languages. The language samples were taken from the Frog Story subcorpus of the CHILDES corpora, which comprises oral narratives collected by various researchers between 1990 and 2005. We extracted narratives produced by typically developing, monolingual, middle-class children. Lithuanian samples, collected according to the same principles, were added. The corpus comprises 249 narratives evenly distributed across eight languages: Croatian, English, French, German, Italian, Lithuanian, Russian and Spanish. Two subcorpora were formed for each language: a younger children corpus and an older children corpus. Four measures of morphological complexity were calculated for each subcorpus: Bane, Kolmogorov, word entropy and relative entropy of word structure. For Spanish and Russian, the younger children corpora had lower morphological complexity than the older children corpora on all four measures. The reverse pattern was found for English and French, and the results for the remaining four languages showed variation. Relative entropy of word structure proved to be indicative of age differences. Word entropy and relative entropy of word structure show potential to demonstrate typological differences.
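To make one of the four measures concrete, here is a minimal sketch of word entropy, assuming it is computed as the Shannon entropy of the unigram word-form distribution of a subcorpus. The abstract does not give the formula or preprocessing the authors used, so the function name `word_entropy`, the whitespace tokenization and the toy sentences below are illustrative assumptions, not the study's data or implementation.

```python
import math
from collections import Counter


def word_entropy(tokens):
    """Shannon entropy (in bits) of the unigram word-form distribution.

    Assumption: each distinct surface form counts as its own type, so a
    subcorpus with more varied inflected forms tends to score higher.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the relative frequency p of each word form
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy samples, invented for illustration only.
younger = "the boy see the frog the frog go".split()
older = "the boy sees the frog and the frog jumps away".split()

print(f"younger: {word_entropy(younger):.3f} bits")
print(f"older:   {word_entropy(older):.3f} bits")
```

In practice the value depends on corpus size, so subcorpora of different lengths would likely need to be size-matched or sampled before comparison; whether and how the study did this is not stated in the abstract.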
Anthology ID:
2022.lrec-1.506
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
4729–4738
URL:
https://aclanthology.org/2022.lrec-1.506
Cite (ACL):
Gordana Hržica, Chaya Liebeskind, Kristina Š. Despot, Olga Dontcheva-Navratilova, Laura Kamandulytė-Merfeldienė, Sara Košutar, Matea Kramarić, and Giedrė Valūnaitė Oleškevičienė. 2022. Morphological Complexity of Children Narratives in Eight Languages. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 4729–4738, Marseille, France. European Language Resources Association.
Cite (Informal):
Morphological Complexity of Children Narratives in Eight Languages (Hržica et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.506.pdf