INVALSI - Mathematical and Language Understanding in Italian: A CALAMITA Challenge

Giovanni Puccetti, Maria Cassese, Andrea Esuli


Abstract
While Italian is a high resource language, there are few Italian-native benchmarks to evaluate Language Models (LMs) generative abilities in this language. This work presents two new benchmarks: Invalsi MATE to evaluate models performance on mathematical understanding in Italian and Invalsi ITA to evaluate language understanding in Italian.These benchmarks are based on the Invalsi tests, which are administered to students of age between 6 and 18 within the Italian school system. These tests are prepared by expert pedagogists and have the explicit goal of testing average students’ performance over time across Italy. Therefore, the questions are well written, appropriate for the age of the students, and are developed with the goal of assessing students’ skills that are essential in the learning process, ensuring that the benchmark proposed here measures key knowledge for undergraduate students.Invalsi MATE is composed of 420 questions about mathematical understanding, these questions range from simple money counting problems to Cartesian geometry questions, e.g. determining if a point belongs to a given line. They are divided into 4 different types: scelta multipla (multiple choice), vero/falso (true/false), numero (number), completa frase (fill the gap). Invalsi ITA is composed of 1279 questions regarding language understanding, these questions involve both the ability to extract information and answer questions about a text passage as well as questions about grammatical knowledge. They are divided into 4 different types: scelta multipla (multiple choice), binaria (binary), domanda aperta (open question) and altro (other).We evaluate 4 powerful language models both English-first and tuned for Italian to see that best accuracy on Invalsi MATE is 55% while best accuracy on Invalsi ITA is 80%.
Anthology ID:
2024.clicit-1.129
Volume:
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Month:
December
Year:
2024
Address:
Pisa, Italy
Editors:
Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, Rachele Sprugnoli
Venue:
CLiC-it
SIG:
Publisher:
CEUR Workshop Proceedings
Note:
Pages:
1168–1175
Language:
URL:
https://aclanthology.org/2024.clicit-1.129/
DOI:
Bibkey:
Cite (ACL):
Giovanni Puccetti, Maria Cassese, and Andrea Esuli. 2024. INVALSI - Mathematical and Language Understanding in Italian: A CALAMITA Challenge. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), pages 1168–1175, Pisa, Italy. CEUR Workshop Proceedings.
Cite (Informal):
INVALSI - Mathematical and Language Understanding in Italian: A CALAMITA Challenge (Puccetti et al., CLiC-it 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.clicit-1.129.pdf