Luca Brigada Villa

2026

This paper presents an annotation scheme developed to analyze linguistic accessibility and inclusivity in Italian cancer-related informational materials. The scheme combines metadata annotation, qualitative analysis of textual and visual features, and automatically extracted measures of linguistic complexity capturing structural, lexical, and probabilistic properties of the texts. A brief case study demonstrates how the proposed framework can be applied to compare documents and identify different sources of linguistic difficulty. The approach provides a replicable methodological basis for large-scale analyses of health communication materials.

2025

MakeItSample: A Python Library for Generating Typological Language Samples Based on the Diversity Value Metric
Luca Brigada Villa
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

2024

From YCOE to UD: Rule-based Root Identification in Old English
Luca Brigada Villa | Martina Giarda
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024

In this paper we apply a set of rules to identify the root of a dependency tree, following the Universal Dependencies formalism and starting from the constituency annotation of the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE). This rule-based root-identification task represents the first step towards a rule-based automatic conversion of this valuable resource into the UD format. After presenting Old English and the annotated resources available for this language, we describe the different rules we applied and then we discuss the results and the errors.

2023

LSDT: a Dependency Treebank of Lombard Sinti
Marco Forlano | Luca Brigada Villa
Proceedings of the Sixth Workshop on the Use of Computational Methods in the Study of Endangered Languages

Using Modern Languages to Parse Ancient Ones: a Test on Old English
Luca Brigada Villa | Martina Giarda
Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP

In this paper we test the parsing performance of a multilingual parser on Old English data using different sets of languages, alone and combined with the target language, to train the models. We compare the results obtained by the models and analyze in more depth the annotation of some peculiar syntactic constructions of the target language, providing plausible linguistic explanations of the errors made even by the best-performing models.

Combining WordNets with Treebanks to study idiomatic language: A pilot study on Rigvedic formulas through the lenses of the Sanskrit WordNet and the Vedic Treebank
Luca Brigada Villa | Erica Biagetti | Riccardo Ginevra | Chiara Zanchi
Proceedings of the 12th Global Wordnet Conference

This paper shows how WordNets can be employed in tandem with morpho-syntactically annotated corpora to study poetic formulas. Pairing the lexico-semantic information of the Sanskrit WordNet with morpho-syntactic annotation from the Vedic Treebank, we perform a pilot study of formulas including SPEECH verbs in the RigVeda, the most ancient text of Sanskrit literature.

2022

Annotating “Absolute” Preverbs in the Homeric and Vedic Treebanks
Luca Brigada Villa | Erica Biagetti | Chiara Zanchi
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

Indo-European preverbs are uninflected morphemes that attach to verbs and modify their meaning. In Early Vedic and Homeric Greek, these morphemes had an ambiguous morphosyntactic status, which raises issues for syntactic annotation. This paper focuses on the annotation of preverbs in so-called “absolute” position in two Universal Dependencies treebanks. This issue is related to the broader topic of how to annotate ellipsis in Universal Dependencies. After discussing some of the current annotations, we propose a new scheme that better accounts for the variety of absolute constructions.

UDeasy: a Tool for Querying Treebanks in CoNLL-U Format
Luca Brigada Villa
Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-10)

Many tools are available to query a dependency treebank, but they require the users to know a query language. In this paper I present UDeasy, an application whose main goal is to allow the users to easily query and extract patterns from a dependency treebank in CoNLL-U format.

2021

Inferring Morphological Complexity from Syntactic Dependency Networks: A Test
Guglielmo Inglese | Luca Brigada Villa
Proceedings of the Third Workshop on Computational Typology and Multilingual NLP

Research in linguistic typology has shown that languages do not fall into the neat morphological types (synthetic vs. analytic) postulated in the 19th century. Instead, analytic and synthetic must be viewed as two poles of a continuum, and languages may show a mix of analytic and synthetic strategies to different degrees. Unfortunately, empirical studies that offer a more fine-grained morphological classification of languages based on these parameters remain few. In this paper, we build upon previous research by Liu & Xu (2011) and investigate the possibility of inferring information on morphological complexity from syntactic dependency networks.