Elisabetta Ježek - ACL Anthology

Elisabetta Ježek

Also published as: Elisabetta Jezek

2025

Subjectivity in Stereotypes against Migrants in Italian: An Experimental Annotation Procedure
Soda Marem Lo | Marco Antonio Stranisci | Alessandra Teresa Cignarella | Simona Frenda | Valerio Basile | Elisabetta Jezek | Viviana Patti
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

Investigating Proactivity in Task-Oriented Dialogues
Sofia Brenna | Elisabetta Jezek | Bernardo Magnini
Dialogue Discourse Volume 16

This paper investigates proactivity, a characteristic phenomenon of collaborative human-human interaction, where a participant in the dialogue offers the addressee some useful and not explicitly requested information. More precisely, a proactive behaviour is: (i) self-prompted and not simply reactive, that is, the speaker does not act merely in response to the requests the other participant has made; (ii) somehow effective for the achievement of the dialogue goal, since the speaker has a long-term, goal-directed behaviour that predicts future states and needs. Proactivity has been poorly investigated from a theoretical point of view, and there is a general need for empirical data for both quantitative and qualitative research. The paper provides an extensive analysis of proactivity in several human-human task-oriented dialogic corpora, selected with different characteristics, including chat exchanges and telephone calls, collection modalities such as natural setting and Wizard of Oz, and two languages, Italian and English. The main result is the D-Pro Corpus, a new resource manually annotated at the utterance level with proactivity and dialogue acts, which makes it possible to investigate proactivity in the context of task-oriented dialogues.
There are several findings from our empirical investigation of proactivity: (i) we find that about 20% of turns in our corpus are proactive turns, showing that this is a widespread and relevant phenomenon; (ii) we confirm the non-reactive nature of proactivity, highlighting the presence of a pattern where a turn in the dialogue triggers a reaction in a following turn and a proactive utterance is then added to the turn; (iii) we show that only a limited number of dialogue acts are actually involved in expressing proactivity, and we discuss the theoretical implications of this finding; (iv) we empirically confirm that proactivity has a crucial role in recovering from goal-failure situations, contributing to the effectiveness of the whole dialogue; (v) we support the intuition of a non-uniform distribution of proactive utterances throughout the dialogue. Our empirical findings and the D-Pro Corpus provide relevant insights for deeper theoretical investigations, as well as crucial resources for improving proactivity in current task-oriented dialogue systems.

Preface
Cristina Bosco | Elisabetta Jezek | Marco Polignano | Manuela Sanguinetti
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Cristina Bosco | Elisabetta Jezek | Marco Polignano | Manuela Sanguinetti
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

2024

Subcategorization of Italian Verbs with LLMs and T-PAS
Luca Simonetti | Elisabetta Jezek | Guido Vetere
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

This study explores the application of Large Language Models (LLMs) to verb subcategorization in Italian, focusing on the identification and classification of syntactic patterns in sentences. While LLMs have made lexical analysis more implicit, explicit argument structure identification remains crucial in domain-specific contexts. The research leverages T-PAS, a rich lexical resource for Italian verbs, to fine-tune the open multilingual model Mistral 7B using the Iterative Reasoning Preference Optimization (IRPO) technique. This approach aims to enhance the recognition and extraction of verbal patterns from Italian sentences, addressing challenges in resource quality, coverage, and frame extraction methods. By combining curated lexical-semantic resources with neural language models, this work contributes to improving verb subcategorization tasks, particularly for the Italian language, and demonstrates the potential of LLMs in refining linguistic analysis tools.

What to Annotate: Retrieving Lexical Markers of Conspiracy Discourse from an Italian-English Corpus of Telegram Data
Costanza Marini | Elisabetta Jezek
Proceedings of the 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024

In this age of social media, Conspiracy Theories (CTs) have become an issue that can no longer be ignored. After providing an overview of CT literature and corpus studies, we describe the creation of a 40,000-token English-Italian bilingual corpus of conspiracy-oriented Telegram comments – the Complotto corpus – and the linguistic analysis we performed using the Sketch Engine online platform (Kilgarriff et al., 2010) on our annotated data to identify statistically relevant linguistic markers of CT discourse. Thanks to the platform’s keywords and key terms extraction functions, we were able to assess the statistical significance of the following lexical and semantic phenomena, both cross-linguistically and cross-CT, namely: (1) evidentiality and epistemic modality markers; (2) debunking vocabulary referring to another version of the truth lying behind the official one; (3) the conceptual metaphor INSTITUTIONS ARE ABUSERS. All these features qualify as markers of CT discourse and have the potential to be effectively used for future semantic annotation tasks to develop automatic systems for CT identification.

2023

Identifying Semantic Argument Types in Predication and Copredication Contexts: A Zero-Shot Cross-Lingual Approach
Deniz Ekin Yavas | Laura Kallmeyer | Rainer Osswald | Elisabetta Jezek | Marta Ricchiardi | Long Chen
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Identifying semantic argument types in predication contexts is not a straightforward task for several reasons, such as inherent polysemy, coercion, and copredication phenomena. In this paper, we train monolingual and multilingual classifiers with a zero-shot cross-lingual approach to identify semantic argument types in predications using pre-trained language models as feature extractors. We train classifiers for different semantic argument types and for both verbal and adjectival predications. Furthermore, we propose a method to detect copredication using these classifiers through identifying the argument semantic type targeted in different predications over the same noun in a sentence. We evaluate the performance of the method on copredication test data with Food•Event nouns for 5 languages.

Why Don’t You Do It Right? Analysing Annotators’ Disagreement in Subjective Tasks
Marta Sandri | Elisa Leonardelli | Sara Tonelli | Elisabetta Jezek
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Annotators’ disagreement in linguistic data has recently been the focus of multiple initiatives aimed at raising awareness of issues related to ‘majority voting’ when aggregating diverging annotations. Disagreement can indeed reflect different aspects of linguistic annotation, from annotators’ subjectivity to sloppiness or lack of enough context to interpret a text. In this work, we first propose a taxonomy of possible reasons leading to annotators’ disagreement in subjective tasks. Then, we manually label part of a Twitter dataset for offensive language detection in English following this taxonomy, identifying how the different categories are distributed. Finally, we run a set of experiments aimed at assessing the impact of the different types of disagreement on classification performance. In particular, we investigate how accurately tweets belonging to different categories of disagreement can be classified as offensive or not, and how injecting data with different types of disagreement in the training set affects performance. We also perform offensive language detection in a multi-task framework, using disagreement classification as an auxiliary task.

Towards an Italian Corpus for Implicit Object Completion
Agnese Daffara | Elisabetta Jezek
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

2022

Annotating Propositional Attitude Verbs and their Arguments
Marta Ricchiardi | Elisabetta Jezek
Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022

This paper describes the results of an empirical study on attitude verbs and propositional attitude reports in Italian. Within the framework of a project aiming at acquiring argument structures for Italian verbs from corpora, we carried out a systematic annotation that aims at identifying which verbs are actually attitude verbs in Italian. The result is a list of 179 argument structures based on corpus-derived patterns of use for 126 verbs that behave as attitude verbs. The distribution of these verbs in the corpus suggests that not only the canonical that-clauses, i.e. subordinates introduced by the complementizer che, but also direct speech, infinitives introduced by the complementizer di, and some nominals are good candidates to express propositional contents in propositional attitude reports. The annotation also sheds light on some issues between semantics and ontology, concerning the relation between events and propositions.

2021

T-PAS Scraper: An Application for Linguistic Data Extraction and Analysis
Emma Romani | Valerio Gattero | Elisabetta Jezek
Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021)

ConteCorpus: An Analysis of People Response to Institutional Communications During the Pandemic
Viviana Ventura | Elisabetta Jezek
Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021)

2020

Tracing Metonymic Relations in T-PAS: An Annotation Exercise on a Corpus-based Resource for Italian
Emma Romani | Elisabetta Ježek
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

Clustering verbal Objects: Manual and Automatic Procedures Compared
Ilaria Colucci | Elisabetta Ježek | Vít Baisa
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

Annotating Croatian Semantic Type Coercions in CROATPAS
Costanza Marini | Elisabetta Jezek
Proceedings of the 16th Joint ACL-ISO Workshop on Interoperable Semantic Annotation

This short research paper presents the results of a corpus-based metonymy annotation exercise on a sample of 101 Croatian verb entries – corresponding to 457 patterns and over 20,000 corpus lines – taken from CROATPAS (Marini & Ježek, 2019), a digital repository of verb argument structures manually annotated with Semantic Type labels on their argument slots following a methodology inspired by Corpus Pattern Analysis (Hanks, 2004 & 2013; Hanks & Pustejovsky, 2005). CROATPAS will be made available online in 2020. Semantic Type labelling is not only well-suited to annotate verbal polysemy, but also metonymic shifts in verb argument combinations, which in Generative Lexicon (Pustejovsky, 1995 & 1998; Pustejovsky & Ježek, 2008) are called Semantic Type coercions. From a sublexical point of view, Semantic Type coercions can be considered as exploitations of one of the qualia roles of those Semantic Types which do not satisfy a verb’s selectional requirements, but do not trigger a different verb sense. Overall, we were able to identify 62 different Semantic Type coercions linked to 1,052 metonymic corpus lines. In the future, we plan to compare our results with those from an equivalent study on Italian verbs (Romani, 2020) for a crosslinguistic analysis of metonymic shifts.

2019

CROATPAS: A Resource of Corpus-derived Typed Predicate Argument Structures for Croatian
Costanza Marini | Elisabetta Jezek
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

A Distributional Model of Affordances in Semantic Type Coercion
Stephen McGregor | Elisabetta Jezek
Proceedings of the 13th International Conference on Computational Semantics - Short Papers

We explore a novel application for interpreting semantic type coercions, motivated by insight into the role that perceptual affordances play in the selection of the semantic roles of artefactual nouns which are observed as arguments for verbs which would stereotypically select for objects of a different type. In order to simulate affordances, which we take to be direct perceptions of context-specific opportunities for action, we perform a distributional analysis of dependency relationships between target words and their modifiers and adjuncts. We use these relationships as the basis for generating on-line transformations which project semantic subspaces in which the interpretations of coercive compositions are expected to emerge as salient word-vectors. We offer some preliminary examples of how this model operates on a dataset of sentences involving coercive interactions between verbs and objects specifically designed to evaluate this work.

2018

Lexical Opposition in Discourse Contrast
Anna Feltracco | Bernardo Magnini | Elisabetta Jezek
Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)

Distributional Analysis of Verbal Neologisms: Task Definition and Dataset Construction
Matteo Amore | Stephen McGregor | Elisabetta Jezek
Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)

Enriching a Lexicon of Discourse Connectives with Corpus-based Data
Anna Feltracco | Elisabetta Jezek | Bernardo Magnini
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Dynamic Argument Structure
Elisabetta Jezek
Linguistic Issues in Language Technology, Volume 15, 2017

This paper presents a new classification of verbs of change and modification, proposing a dynamic interpretation of the lexical semantics of the predicate and its arguments. Adopting the model of dynamic event structure proposed in Pustejovsky (2013), and extending the model of dynamic selection outlined in Pustejovsky and Jezek (2011), we define a verb class in terms of its Dynamic Argument Structure (DAS), a representation which encodes how the participants involved in the change behave as the event unfolds. We address how the logical resources and results of change predicates are realized syntactically, if at all, as well as how the exploitation of the resource results in the initiation or termination of a new object, i.e. the result. We show how DAS can be associated with a dynamically encoded event structure representation, which measures the change making reference to a scalar component, modelled in terms of assignment and/or testing of values of attributes of participants.

A Geometric Method for Detecting Semantic Coercion
Stephen McGregor | Elisabetta Jezek | Matthew Purver | Geraint Wiggins
Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Long papers

Tagging Semantic Types for Verb Argument Positions
Francesca Della Moretta | Anna Feltracco | Elisabetta Jezek | Bernardo Magnini
Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)

Contrast-Ita Bank: A corpus for Italian Annotated with Discourse Contrast Relations
Anna Feltracco | Bernardo Magnini | Elisabetta Jezek
Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)

2016

Acquiring Opposition Relations among Italian Verb Senses using Crowdsourcing
Anna Feltracco | Simone Magnolini | Elisabetta Jezek | Bernardo Magnini
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe an experiment for the acquisition of opposition relations among Italian verb senses, based on a crowdsourcing methodology. The goal of the experiment is to discuss whether the types of opposition we distinguish (i.e. complementarity, antonymy, converseness and reversiveness) are actually perceived by the crowd. In particular, we collect data for Italian by using the crowdsourcing platform CrowdFlower. We ask annotators to judge the type of opposition existing among pairs of sentences, previously judged as opposite, that differ only in a verb: the verb in the first sentence is the opposite of the verb in the second sentence. Data corroborate the hypothesis that some opposition relations exclude each other, while others interact, being recognized as compatible by the contributors.

Using WordNet to Build Lexical Sets for Italian Verbs
Anna Feltracco | Lorenzo Gatti | Elisabetta Jezek | Bernardo Magnini | Simone Magnolini
Proceedings of the 8th Global WordNet Conference (GWC)

We present a methodology for building lexical sets for argument slots of Italian verbs. We start from an inventory of semantically typed Italian verb frames and through a mapping to WordNet we automatically annotate the sets of fillers for the argument positions in a corpus of sentences. We evaluate both a baseline algorithm and a syntax driven algorithm and show that the latter performs significantly better in terms of precision.

2015

Opposition Relations among Verb Frames
Anna Feltracco | Elisabetta Jezek | Bernardo Magnini
Proceedings of the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation

Corpus Patterns for Semantic Processing
Octavian Popescu | Patrick Hanks | Elisabetta Jezek | Daisuke Kawahara
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing: Tutorial Abstracts

Instrument subjects without Instrument role
Elisabetta Ježek | Rossella Varvara
Proceedings of the 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-11)

2014

T-PAS; A resource of Typed Predicate Argument Structures for linguistic analysis and semantic processing
Elisabetta Jezek | Bernardo Magnini | Anna Feltracco | Alessia Bianchini | Octavian Popescu
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The goal of this paper is to introduce T-PAS, a resource of typed predicate argument structures for Italian, acquired from corpora by manual clustering of distributional information about Italian verbs, to be used for linguistic analysis and semantic processing tasks. T-PAS is the first resource for Italian in which semantic selection properties and sense-in-context distinctions of verbs are characterized fully on empirical grounds. In the paper, we first describe the process of pattern acquisition and corpus annotation (section 2) and its ongoing evaluation (section 3). We then demonstrate the benefits of pattern tagging for NLP purposes (section 4), and discuss current efforts to improve the annotation of the corpus (section 5). We conclude by reporting on ongoing experiments using semiautomatic techniques for extending coverage (section 6).

2013

Sweetening Ontologies cont’d
Elisabetta Jezek
Proceedings of the Joint Symposium on Semantic Processing. Textual Inference and Structures in Corpora

2012

Annotating Qualia Relations in Italian and French Complex Nominals
Pierrette Bouillon | Elisabetta Jezek | Chiara Melloni | Aurélie Picton
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The goal of this paper is to provide an annotation scheme for compounds based on generative lexicon theory (GL, Pustejovsky, 1995; Bassac and Bouillon, 2001). This scheme has been tested on a set of compounds automatically extracted from the Europarl corpus (Koehn, 2005) both in Italian and French. The motivation is twofold. On the one hand, it should help refine existing compound classifications and better explain lexicalization in both languages. On the other hand, we hope that the extracted generalizations can be used in NLP, for example for improving MT systems or for query reformulation (Claveau, 2003). In this paper, we focus on the annotation scheme and its ongoing evaluation.

2011

Senso Comune, an Open Knowledge Base of Italian Language
Guido Vetere | Alessandro Oltramari | Isabella Chiari | Elisabetta Jezek | Laure Vieu | Fabio Massimo Zanzotto
Traitement Automatique des Langues, Volume 52, Numéro 3 : Ressources linguistiques libres [Free Language Resources]

2010

SemEval-2010 Task 7: Argument Selection and Coercion
James Pustejovsky | Anna Rumshisky | Alex Plotnick | Elisabetta Jezek | Olga Batiukova | Valeria Quochi
Proceedings of the 5th International Workshop on Semantic Evaluation

Capturing Coercions in Texts: a First Annotation Exercise
Elisabetta Jezek | Valeria Quochi
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we report the first results of an annotation exercise of argument coercion phenomena performed on Italian texts. Our corpus consists of ca. 4,000 sentences from the PAROLE sottoinsieme corpus (Bindi et al. 2000) annotated with Selection and Coercion relations among verb-noun pairs formatted in XML according to the Generative Lexicon Mark-up Language (GLML) format (Pustejovsky et al., 2008). For the purposes of coercion annotation, we selected 26 Italian verbs that impose semantic typing on their arguments in either Subject, Direct Object or Complement position. Every sentence of the corpus is annotated with the source type for the noun arguments by two annotators plus a judge. An overall agreement of 0.87 kappa indicates that the annotation methodology is reliable. A qualitative analysis of the results allows us to outline some suggestions for improvement of the task: 1) a different account of complex types for nouns has to be devised and 2) a more comprehensive account of coercion mechanisms requires annotation of the deeper meaning dimensions that are targeted in coercion operations, such as those captured by Qualia relations.
