Chiara Zanchi


2023

Linking the Sanskrit WordNet to the Vedic Dependency Treebank: a pilot study
Erica Biagetti | Chiara Zanchi | Silvia Luraghi
Proceedings of the 12th Global Wordnet Conference

The Sanskrit WordNet is a resource currently under development, whose core was induced from a Vedic text sample semantically annotated by means of an ontology mapped onto the Princeton WordNet synsets. Building on a previous case study on Ancient Greek (Zanchi et al. 2021), we show how sentence frames can be extracted from morphosyntactically parsed corpora by linking an existing dependency treebank of Vedic Sanskrit to verbal synsets in the Sanskrit WordNet. Our case study focuses on two verbs of asking, yāc- and prach-, featuring a high degree of variability in sentence frames. Treebanks enhanced with WordNet-based semantic information proved to be of crucial help in motivating sentence frame alternations.

Combining WordNets with Treebanks to study idiomatic language: A pilot study on Rigvedic formulas through the lenses of the Sanskrit WordNet and the Vedic Treebank
Luca Brigada Villa | Erica Biagetti | Riccardo Ginevra | Chiara Zanchi
Proceedings of the 12th Global Wordnet Conference

This paper shows how WordNets can be employed in tandem with morpho-syntactically annotated corpora to study poetic formulas. Pairing the lexico-semantic information of the Sanskrit WordNet with morpho-syntactic annotation from the Vedic Treebank, we perform a pilot study of formulas including SPEECH verbs in the RigVeda, the most ancient text of Sanskrit literature.

2022

PaVeDa - Pavia Verbs Database: Challenges and Perspectives
Chiara Zanchi | Silvia Luraghi | Claudia Roberta Combei
Proceedings of the 4th Workshop on Research in Computational Linguistic Typology and Multilingual NLP

This paper describes an ongoing endeavor to construct the Pavia Verbs Database (PaVeDa) – an open-access typological resource that builds upon previous work on verb argument structure, in particular the Valency Patterns Leipzig (ValPaL) project (Hartmann et al., 2013). The PaVeDa database features four major innovations as compared to the ValPaL database: (i) it includes data from ancient languages, enabling diachronic research; (ii) it expands the language sample to language families that are not represented in ValPaL; (iii) it is linked to external corpora that are used as sources of usage-based examples of stored patterns; (iv) it introduces a new cross-linguistic layer of annotation for valency patterns, which allows for contrastive data visualization.

Annotating “Absolute” Preverbs in the Homeric and Vedic Treebanks
Luca Brigada Villa | Erica Biagetti | Chiara Zanchi
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

Indo-European preverbs are uninflected morphemes that attach to verbs and modify their meaning. In Early Vedic and Homeric Greek, these morphemes held an ambiguous morphosyntactic status, which raises issues for syntactic annotation. This paper focuses on the annotation of preverbs in so-called “absolute” position in two Universal Dependencies treebanks. This issue is related to the broader topic of how to annotate ellipsis in Universal Dependencies. After discussing some of the current annotations, we propose a new scheme that better accounts for the variety of absolute constructions.

Dead or Murdered? Predicting Responsibility Perception in Femicide News Reports
Gosse Minnema | Sara Gemelli | Chiara Zanchi | Tommaso Caselli | Malvina Nissim
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Different linguistic expressions can conceptualize the same event from different viewpoints by emphasizing certain participants over others. Here, we investigate a case where this has social consequences: how do linguistic expressions of gender-based violence (GBV) influence who we perceive as responsible? We build on previous psycholinguistic research in this area and conduct a large-scale perception survey of GBV descriptions automatically extracted from a corpus of Italian newspapers. We then train regression models that predict the salience of GBV participants with respect to different dimensions of perceived responsibility. Our best model (fine-tuned BERT) shows solid overall performance, with large differences between dimensions and participants: salient _focus_ is more predictable than salient _blame_, and perpetrators’ salience is more predictable than victims’ salience. Experiments with ridge regression models using different representations show that features based on linguistic theory perform similarly to word-based features. Overall, we show that different linguistic choices do trigger different perceptions of responsibility, and that such perceptions can be modelled automatically. This work can be a core instrument to raise awareness of the consequences of different perspectivizations in the general public and in news producers alike.

SocioFillmore: A Tool for Discovering Perspectives
Gosse Minnema | Sara Gemelli | Chiara Zanchi | Tommaso Caselli | Malvina Nissim
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

SOCIOFILLMORE is a multilingual tool which helps to bring to the fore the focus or the perspective that a text expresses in depicting an event. Our tool, whose rationale we also support through a large collection of human judgements, is theoretically grounded on frame semantics and cognitive linguistics, and implemented using the LOME frame semantic parser. We describe SOCIOFILLMORE’s development and functionalities, show how non-NLP researchers can easily interact with the tool, and present some example case studies which are already incorporated in the system, together with the kind of analysis that can be visualised.

2021

Toward the creation of WordNets for ancient Indo-European languages
Erica Biagetti | Chiara Zanchi | William Michael Short
Proceedings of the 11th Global Wordnet Conference

This paper presents work in progress toward the creation of a family of WordNets for Sanskrit, Ancient Greek, and Latin. Building on previous attempts in the field, we extend these efforts by bridging WordNet relational semantics with theories of meaning from Cognitive Linguistics. We discuss some of the innovations we have introduced to the WordNet architecture to better capture the polysemy of words, as well as features specific to the Indo-European language family. We conclude by framing our work within the larger picture of resources available for ancient languages and showing that WordNet-backed search tools have the potential to redefine the kinds of questions that can be asked of ancient language corpora.