Flavio Massimiliano Cecchini

Also published as: Flavio Massimiliano Cecchini

2025

Quid verbumst? Applying a definition of word to Latin in Universal Dependencies
Flavio Massimiliano Cecchini
Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025)

Words, more specifically “syntactic words”, are at the centre of a dependency-based approach like Universal Dependencies. Nonetheless, its guidelines do not make explicit how such a word should be defined and identified, and so it happens that different treebanks use different standards to this end. To counter this vagueness, the community has been recently discussing a definition put forward in (Haspelmath, 2023) which is not fully uncontroversial. This contribution is a preliminary case study that tries its hand at concretely applying this definition (except for compounds) to Latin in order to gain more insights about its operability and groundedness. This is helped by the spread of Latin over many treebanks, the presence of good linguistic resources to analyse it, and a linguistic type which is probably not fully considered in (Haspelmath, 2023). On the side, this work shows once more the difficulties of turning theoretical definitions into working directives in the realm of linguistic annotation.

2023

pdf bib

Linking the Computational Historical Semantics corpus to the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin
Giulia Pedonese | Flavio Massimiliano Cecchini | Marco Carlo Passarotti
Proceedings of the 4th Conference on Language, Data and Knowledge

pdf bib

Highway to Hell. Towards a Universal Dependencies Treebank for Dante Alighieri’s Comedy
Claudia Corbetta | Marco Passarotti | Flavio Massimiliano Cecchini | Giovanni Moretti
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

2022

pdf bib abs

A Treebank-based Approach to the Supprema Constructio in Dante’s Latin Works
Flavio Massimiliano Cecchini | Giulia Pedonese
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

This paper aims to apply a corpus-driven approach to Dante Alighieri’s Latin works using UDante, a treebank based on Dante Search and part of the Universal Dependencies project. We present a method based on the notion of barycentre applied to a dependency tree as a way to calculate the “syntactic balance” of a sentence. Its application to Dante’s Latin works shows its potential in analysing the style of an author, and contributes to the interpretation of the supprema constructio mentioned in DVE II vi 7 as a well balanced syntactic pattern modeled on Latin literary writing.

pdf bib abs

Overview of the EvaLatin 2022 Evaluation Campaign
Rachele Sprugnoli | Marco Passarotti | Flavio Massimiliano Cecchini | Margherita Fantoli | Giovanni Moretti
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

This paper describes the organization and the results of the second edition of EvaLatin, the campaign for the evaluation of Natural Language Processing tools for Latin. The three shared tasks proposed in EvaLatin 2022, i.,e.,Lemmatization, Part-of-Speech Tagging and Features Identification, are aimed to foster research in the field of language technologies for Classical languages. The shared dataset consists of texts mainly taken from the LASLA corpus. More specifically, the training set includes only prose texts of the Classical period, whereas the test set is organized in three sub-tasks: a Classical sub-task on a prose text of an author not included in the training data, a Cross-genre sub-task on poetic and scientific texts, and a Cross-time sub-task on a text of the 15th century. The results obtained by the participants for each task and sub-task are presented and discussed.

2021

pdf bib

Formae reformandae: for a reorganisation of verb form annotation in Universal Dependencies illustrated by the specific case of Latin
Flavio Massimiliano Cecchini
Proceedings of the Fifth Workshop on Universal Dependencies (UDW, SyntaxFest 2021)

pdf bib

The Annotation of Liber Abbaci, a Domain-Specific Latin Resource
Francesco Grotto | Rachele Sprugnoli | Margherita Fantoli | Maria Simi | Flavio Massimiliano Cecchini | Marco Passarotti
Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021)

2020

pdf bib abs

Overview of the EvaLatin 2020 Evaluation Campaign
Rachele Sprugnoli | Marco Passarotti | Flavio Massimiliano Cecchini | Matteo Pellegrini
Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages

This paper describes the first edition of EvaLatin, a campaign totally devoted to the evaluation of NLP tools for Latin. The two shared tasks proposed in EvaLatin 2020, i. e. Lemmatization and Part-of-Speech tagging, are aimed at fostering research in the field of language technologies for Classical languages. The shared dataset consists of texts taken from the Perseus Digital Library, processed with UDPipe models and then manually corrected by Latin experts. The training set includes only prose texts by Classical authors. The test set, alongside with prose texts by the same authors represented in the training set, also includes data relative to poetry and to the Medieval period. This also allows us to propose the Cross-genre and Cross-time subtasks for each task, in order to evaluate the portability of NLP tools for Latin across different genres and time periods. The results obtained by the participants for each task and subtask are presented and discussed.

pdf bib

UDante: First Steps Towards the Universal Dependencies Treebank of Dante’s Latin Works
Flavio Massimiliano Cecchini | Rachele Sprugnoli | Giovanni Moretti | Marco Passarotti
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

pdf bib abs

A New Latin Treebank for Universal Dependencies: Charters between Ancient Latin and Romance Languages
Flavio Massimiliano Cecchini | Timo Korkiakangas | Marco Passarotti
Proceedings of the Twelfth Language Resources and Evaluation Conference

The present work introduces a new Latin treebank that follows the Universal Dependencies (UD) annotation standard. The treebank is obtained from the automated conversion of the Late Latin Charter Treebank 2 (LLCT2), originally in the Prague Dependency Treebank (PDT) style. As this treebank consists of Early Medieval legal documents, its language variety differs considerably from both the Classical and Medieval learned varieties prevalent in the other currently available UD Latin treebanks. Consequently, besides significant phenomena from the perspective of diachronic linguistics, this treebank also poses several challenging technical issues for the current and future syntactic annotation of Latin in the UD framework. Some of the most relevant cases are discussed in depth, with comparisons between the original PDT and the resulting UD annotations. Additionally, an overview of the UD-style structure of the treebank is given, and some diachronic aspects of the transition from Latin to Romance languages are highlighted.

2018

pdf bib

pdf bib abs

Challenges in Converting the Index Thomisticus Treebank into Universal Dependencies
Flavio Massimiliano Cecchini | Marco Passarotti | Paola Marongiu | Daniel Zeman
Proceedings of the Second Workshop on Universal Dependencies (UDW 2018)

This paper describes the changes applied to the original process used to convert the Index Thomisticus Treebank, a corpus including texts in Medieval Latin by Thomas Aquinas, into the annotation style of Universal Dependencies. The changes are made both to harmonise the Universal Dependencies version of the Index Thomisticus Treebank with the two other available Latin treebanks and to fix errors and inconsistencies resulting from the original process. The paper details the treatment of different issues in PoS tagging, lemmatisation and assignment of dependency relations. Finally, it assesses the quality of the new conversion process by providing an evaluation against a gold standard.