Catia Cucchiarini

Also published as: C. Cucchiarini

2025

Investigating Further Fine-tuning Wav2vec2.0 in Low Resource Settings for Enhancing Children Speech Recognition and Word-level Reading Diagnosis
Lingyun Gao | Cristian Tejedor-Garcia | Catia Cucchiarini | Helmer Strik
Proceedings of AAAS Workshop 2025 – Automatic Assessment of Atypical Speech

2024

A Joint Approach for Automatic Analysis of Reading and Writing Errors
Wieke Harmsen | Catia Cucchiarini | Roeland van Hout | Helmer Strik
Proceedings of the Second Workshop on Computation and Written Language (CAWL) @ LREC-COLING 2024

Analyzing the errors that children make on their way to becoming fluent readers and writers can provide invaluable scientific insights into the processes that underlie literacy acquisition. To this end, we present in this paper an extension of an earlier developed spelling error detection and classification algorithm for Dutch, so that reading errors can also be automatically detected from their phonetic transcription. The strength of this algorithm lies in its ability to detect errors at Phoneme-Corresponding Unit (PCU) level, where a PCU is a sequence of letters corresponding to one phoneme. We validated this algorithm and found good agreement between manual and automatic reading error classifications. We also used the algorithm to analyze written words by second graders and phonetic transcriptions of read words by first graders. With respect to the writing data, we found that the PCUs ‘ei’, ‘eu’, ‘g’, ‘ij’ and ‘ch’ were most frequently written incorrectly; for the reading data, these were the PCUs ‘v’, ‘ui’, ‘ng’, ‘a’ and ‘g’. This study presents a first attempt at developing a joint method for detecting reading and writing errors. In future research this algorithm can be used to analyze corpora containing reading and writing data from the same children.

2022

Multilingual Transfer Learning for Children Automatic Speech Recognition
Thomas Rolland | Alberto Abad | Catia Cucchiarini | Helmer Strik
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Despite recent advances in automatic speech recognition (ASR), the recognition of children’s speech still remains a significant challenge. This is mainly due to the high acoustic variability and the limited amount of available training data. The latter problem is particularly evident in languages other than English, which are usually less-resourced. In the current paper, we address children ASR in a number of less-resourced languages by combining several small-sized children speech corpora from these languages. In particular, we address the following research question: Does a novel two-step training strategy in which multilingual learning is followed by language-specific transfer learning outperform conventional single language/task training for children speech, as well as multilingual and transfer learning alone? Based on previous experimental results with English, we hypothesize that multilingual learning provides a better generalization of the underlying characteristics of children’s speech. Our results provide a positive answer to our research question, by showing that using transfer learning on top of a multilingual model for an unseen language outperforms conventional single language-specific learning.

We present an overview of LARA, the Learning And Reading Assistant, an open source platform for easy creation and use of multimedia annotated texts designed to support the improvement of reading skills. The paper is divided into three parts. In the first, we give a brief summary of LARA’s processing. In the second, we describe some generic functionality specially relevant for reading assistance: support for phonetically annotated texts, support for image-based texts, and integrated production of text-to-speech (TTS) generated audio. In the third, we outline some of the larger projects so far carried out with LARA, involving development of content for learning second and foreign (L2) languages such as Icelandic, Farsi, Irish, Old Norse and the Australian Aboriginal language Barngarla, where the issues involved overlap with those that arise when trying to help students improve first-language (L1) reading skills. All software and almost all content is freely available.

2020

An important objective in health technology is the ability to gather information about people’s well-being. Structured interviews can be used to obtain this information, but are time-consuming and not scalable. Questionnaires provide an alternative way to extract such information, though they typically lack depth. In this paper, we present our first prototype of the BLISS agent, an artificially intelligent agent which intends to automatically discover what makes people happy and healthy. The goal of Behaviour-based Language-Interactive Speaking Systems (BLISS) is to understand the motivations behind people’s happiness by conducting a personalized spoken dialogue based on a happiness model. We built our first prototype of the model to collect 55 spoken dialogues, in which the BLISS agent asked questions to users about their happiness and well-being. Apart from a description of the BLISS architecture, we also provide details about our dataset, which contains over 120 activities and 100 motivations and is made available for use.

Dedicated Language Resources for Interdisciplinary Research on Multiword Expressions: Best Thing since Sliced Bread
Ferdy Hubers | Catia Cucchiarini | Helmer Strik
Proceedings of the Twelfth Language Resources and Evaluation Conference

Multiword expressions such as idioms (beat about the bush), collocations (plastic surgery) and lexical bundles (in the middle of) are challenging for disciplines like Natural Language Processing (NLP), psycholinguistics and second language acquisition, due to their more or less fixed character. Idiomatic expressions are especially problematic, because they convey a figurative meaning that cannot always be inferred from the literal meanings of the component words. Researchers acknowledge that important properties that characterize idioms, such as frequency of exposure, familiarity, transparency, and imageability, should be taken into account in research, but these are typically properties that rely on subjective judgments. This is probably one of the reasons why many studies that investigated idiomatic expressions collected limited information about idiom properties for very small numbers of idioms only. In this paper we report on cross-boundary work aimed at developing a set of tools and language resources that are considered crucial for this kind of multifaceted research. We discuss the results of our research and suggest possible avenues for future research.

2016

Palabras: Crowdsourcing Transcriptions of L2 Speech
Eric Sanders | Pepi Burgos | Catia Cucchiarini | Roeland van Hout
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We developed a web application for crowdsourcing transcriptions of Dutch words spoken by Spanish L2 learners. In this paper we discuss the design of the application and the influence of metadata and various forms of feedback. Useful data were obtained from 159 participants, with an average of over 20 transcriptions per item, which seems a satisfactory result for this type of research. Informing participants about how many items they still had to complete, and not how many they had already completed, turned out to be an incentive to do more items. Assigning participants a score for their performance made it more attractive for them to carry out the transcription task, but this seemed to influence their performance. We discuss possible advantages and disadvantages in connection with the aim of the research and consider possible lessons for designing future experiments.

A Dutch Dysarthric Speech Database for Individualized Speech Therapy Research
Emre Yilmaz | Mario Ganzeboom | Lilian Beijer | Catia Cucchiarini | Helmer Strik
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present a new Dutch dysarthric speech database containing utterances of neurological patients with Parkinson’s disease, traumatic brain injury and cerebrovascular accident. The speech content is phonetically and linguistically diversified by using numerous structured sentence and word lists. Containing more than 6 hours of mildly to moderately dysarthric speech, this database can be used for research on dysarthria and for developing and testing speech-to-text systems designed for medical applications. Current activities aimed at extending this database are also discussed.

2014

ASR-based CALL systems and learner speech data: new resources and opportunities for research and development in second language learning
Catia Cucchiarini | Steve Bodnar | Bart Penning de Vries | Roeland van Hout | Helmer Strik
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we describe the language resources developed within the project Feedback and the Acquisition of Syntax in Oral Proficiency (FASOP), which is aimed at investigating the effectiveness of various forms of practice and feedback on the acquisition of syntax in second language (L2) oral proficiency, as well as their interplay with learner characteristics such as education level, learner motivation and confidence. For this purpose, use is made of a Computer Assisted Language Learning (CALL) system that employs Automatic Speech Recognition (ASR) technology to allow spoken interaction and to create an experimental environment that guarantees as much control over the language learning setting as possible. The focus of the present paper is on the resources that are being produced in FASOP. In line with the theme of this conference, we present the different types of resources developed within this project and the way in which these could be used to pursue innovative research in second language acquisition and to develop and improve ASR-based language learning applications.

2012

The DISCO ASR-based CALL system: practicing L2 oral skills and beyond
Helmer Strik | Jozef Colpaert | Joost van Doremalen | Catia Cucchiarini
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper we describe the research that was carried out and the resources that were developed within the DISCO (Development and Integration of Speech technology into COurseware for language learning) project. This project aimed at developing an ASR-based CALL system that automatically detects pronunciation and grammar errors in Dutch L2 speaking and generates appropriate, detailed feedback on the errors detected. We briefly introduce the DISCO system and present its design, architecture and speech recognition modules. We then describe a first evaluation of the complete DISCO system and present some results. The resources generated through DISCO are subsequently described together with possible ways of efficiently generating additional resources in the future.

2010

Human Language Technology and Communicative Disabilities: Requirements and Possibilities for the Future
Marina B. Ruiter | Toni C. M. Rietveld | Catia Cucchiarini | Emiel J. Krahmer | Helmer Strik
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

For some years now, the Nederlandse Taalunie (Dutch Language Union) has been active in promoting the development of human language technology (HLT) applications for users of Dutch with communication disabilities. The reason is that HLT products and services may enable these users to improve their verbal autonomy and communication skills. We sought to identify a minimum common set of HLT resources that is required to develop tools for a wide range of communication disabilities. In order to reach this goal, we investigated the specific HLT needs of communicatively disabled people and related these needs to the underlying HLT software components. By analysing the availability and quality of these essential HLT resources, we were able to identify which of the crucial elements need further research and development to become usable for developing applications for communicatively disabled users of Dutch. The results obtained in the current survey can be used to inform policy institutions on how they can stimulate the development of HLT resources for this target group. In the current study results were obtained for Dutch, but a similar approach can also be used for other languages.

2008

The Dutch-Flemish Comprehensive Approach to HLT Stimulation and Innovation: STEVIN, HLT Agency and beyond
Peter Spyns | Elisabeth D’Halleweyn | Catia Cucchiarini
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper shows how a research and industry stimulation programme on human language technologies (HLT) for Dutch can be enhanced with more specific innovation policy aspects to support the take-up by the HLT industry in the Netherlands and Flanders. Important to note is the distinction between the HLT programme itself (called STEVIN) with its specific related committees and actions and the overall policy instruments (HLT Agency, HLT steering board …) that try to span the entire domain of HLT for Dutch and have a more permanent character. The establishment of a pricing committee and a PR & communication working group is explained as a consequence of adopting the notion of innovation system as a theoretical framework. It means that a stronger emphasis is put on improving knowledge transfer and exchange amongst actors in the field. Therefore, the focus at the programme management level is shifting from the projects’ research activities producing results to gathering the results, making them available at a certain cost and advertising them through the appropriate channels to the appropriate potential customers. Our conclusion is that this policy stimulates the transfer from academia to industry, though it is too soon for an in-depth assessment of the STEVIN programme and other HLT innovation policy instruments.

Recording Speech of Children, Non-Natives and Elderly People for HLT Applications: the JASMIN-CGN Corpus.
Catia Cucchiarini | Joris Driesen | Hugo Van hamme | Eric Sanders
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Within the framework of the Dutch-Flemish programme STEVIN, the JASMIN-CGN (Jongeren, Anderstaligen en Senioren in Mens-machine Interactie Corpus Gesproken Nederlands) project was carried out, which was aimed at collecting speech of children, non-natives and elderly people. The JASMIN-CGN project is an extension of the Spoken Dutch Corpus (CGN) along three dimensions. First, by collecting a corpus of contemporary Dutch as spoken by children of different age groups, elderly people and non-natives with different mother tongues, an extension along the age and mother tongue dimensions was achieved. In addition, we collected speech material in a communication setting that was not envisaged in the CGN: human-machine interaction. One third of the data was collected in Flanders and two thirds in the Netherlands. In this paper we report on our experiences in collecting this corpus and we describe some of the important decisions that we made in the attempt to combine efficiency and high quality.

2006

JASMIN-CGN: Extension of the Spoken Dutch Corpus with Speech of Elderly People, Children and Non-natives in the Human-Machine Interaction Modality
Catia Cucchiarini | Hugo Van hamme | Olga van Herwijnen | Felix Smits
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Large speech corpora (LSC) constitute an indispensable resource for conducting research in speech processing and for developing real-life speech applications. In 2004 the Spoken Dutch Corpus (CGN) became available, a corpus of standard Dutch as spoken by adult natives in the Netherlands and Flanders. Owing to budget constraints, CGN does not include speech of children, non-natives, elderly people and recordings of speech produced in human-machine interactions. Since such recordings would be extremely useful for conducting research and for developing HLT applications for these specific groups of speakers of Dutch, a new project, JASMIN-CGN, was started which aims at extending CGN in different ways: by collecting a corpus of contemporary Dutch as spoken by children of different age groups, non-natives with different mother tongues and elderly people in the Netherlands and Flanders and, in addition, by collecting speech material in a communication setting that was not envisaged in CGN: human-machine interaction. We expect that the knowledge gathered from these data can be generalized to developing appropriate systems also for other speaker groups (i.e. adult natives). One third of the data will be collected in Flanders and two thirds in the Netherlands.

The Dutch-Flemish HLT Programme STEVIN: Essential Speech and Language Technology Resources
Elisabeth D’Halleweyn | Jan Odijk | Lisanne Teunissen | Catia Cucchiarini
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In 2004 a consortium of ministries and organizations in the Netherlands and Flanders launched the comprehensive Dutch-Flemish HLT programme STEVIN (a Dutch acronym for Essential Speech and Language Technology Resources). To guarantee its Dutch-Flemish character, this large-scale programme is carried out under the auspices of the intergovernmental Dutch Language Union (NTU). The aim of STEVIN is to contribute to the further progress of HLT for the Dutch language, by raising awareness of HLT results, stimulating the demand for HLT products, promoting strategic research in HLT, and developing HLT resources that are essential and are known to be missing. Furthermore, a structure was set up for the management, maintenance and distribution of HLT resources. The STEVIN programme, which will run from 2004 to 2009, resulted from HLT activities in the Dutch language area, which were reported on at previous LREC conferences (2000, 2002, 2004). In this paper we will explain how different activities are combined in one comprehensive programme. We will show how cooperation can successfully be realized between different parties (language and speech technology, Flanders and the Netherlands, academia, industry and policy institutions) so as to achieve one common goal: progress in HLT.