Marcello Ferro


2024

Assessing Reading Literacy of Bulgarian Pupils with Finger–tracking
Alessandro Lento | Andrea Nadalini | Marcello Ferro | Claudia Marzi | Vito Pirrelli | Tsvetana Dimitrova | Hristina Kukova | Valentina Stefanova | Maria Todorova | Svetla Koeva
Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024)

The paper reports on the first steps in developing a time-stamped multimodal dataset of reading data by Bulgarian children. Data are being collected, structured and analysed by means of ReadLet, an innovative infrastructure for multimodal language data collection that uses a tablet as a reader’s front-end. The overall goal of the project is to quantitatively analyse the reading skills of a sample of early Bulgarian readers collected over a two-year period, and compare them with the reading data of early readers of Italian, collected using the same protocol. We illustrate design issues of the experimental protocol, as well as the data acquisition process and the post-processing phase of data annotation/augmentation. To evaluate the potential and usefulness of the Bulgarian dataset for reading research, we present some preliminary statistical analyses of our recently collected data. They show robust convergence trends between Bulgarian and Italian early reading development stages.

ReadLet: A Dataset for Oral, Visual and Tactile Text Reading Data of Early and Mature Readers
Marcello Ferro | Claudia Marzi | Andrea Nadalini | Loukia Taxitari | Alessandro Lento | Vito Pirrelli
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The paper presents the design and construction of a time-stamped multimodal dataset for reading research, including multiple time-aligned temporal signals elicited with four experimental trials of connected text reading by both child and adult readers. We present the experimental protocols, as well as the data acquisition process and the post-processing phase of data annotation/augmentation. To evaluate the potential and usefulness of a time-aligned multimodal dataset for reading research, we present a few statistical analyses showing the correlation and complementarity of multimodal time-series of reading data, as well as some results of modelling adults’ reading data by integrating different modalities. The total dataset size amounts to about 2.5 GByte in compressed format.

2018

Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Claudia Marzi | Marcello Ferro | Ouafae Nahli | Patrizia Belik | Stavros Bompolas | Vito Pirrelli
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2012

Evaluating Hebbian Self-Organizing Memories for Lexical Representation and Access
Claudia Marzi | Marcello Ferro | Claudia Caudai | Vito Pirrelli
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The lexicon is the store of words in long-term memory. Any attempt at modelling lexical competence must take issues of string storage seriously. In the present contribution, we discuss a few desiderata that any biologically-inspired computational model of the mental lexicon has to meet, and detail a multi-task evaluation protocol for their assessment. The proposed protocol is applied to a novel computational architecture for lexical storage and acquisition, the "Topological Temporal Hebbian SOMs" (T2HSOMs), which are grids of topologically organised memory nodes with dedicated sensitivity to time-bound sequences of letters. These maps can provide a rigorous and testable conceptual framework within which to provide a comprehensive, multi-task protocol for testing the performance of Hebbian self-organising memories, and a comprehensive picture of the complex dynamics between lexical processing and the acquisition of morphological structure.