Sara Tonelli

2025

Job Unfair: An Investigation of Gender and Occupational Bias in Free-Form Text Completions by LLMs
Camilla Casula | Sebastiano Vecellio Salto | Elisa Leonardelli | Sara Tonelli
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Disentangling how gender and occupations are encoded by LLMs is crucial to identify possible biases and prevent harms, especially given the widespread use of LLMs in sensitive domains such as human resources. In this work, we carry out an in-depth investigation of gender and occupational biases in English and Italian as expressed by 9 different LLMs (both base and instruction-tuned). Specifically, we focus on the analysis of sentence completions when LLMs are prompted with job-related sentences including different gender representations. We carry out a manual analysis of 4,500 generated texts over 4 dimensions that can reflect bias, we propose a novel embedding-based method to investigate biases in generated texts and, finally, we carry out a lexical analysis of the model completions. In our qualitative and quantitative evaluation we show that many facets of social bias remain unaccounted for even in aligned models, and LLMs in general still reflect existing gender biases in both languages. Finally, we find that models still struggle with gender-neutral expressions, especially beyond English.

ModaFact: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection
Marco Rovera | Serena Cristoforetti | Sara Tonelli
Proceedings of the 31st International Conference on Computational Linguistics

Factuality and modality are two crucial aspects concerning events, since they convey the speaker’s commitment to a situation in discourse as well as how this event is supposed to occur in terms of norms, wishes, necessity, duty and so on. Capturing them both is necessary to truly understand an utterance’s meaning and the speaker’s perspective with respect to a mentioned event. Yet, NLP studies have mostly dealt with these two aspects separately, mainly devoting past efforts to the development of English datasets. In this work, we propose ModaFact, a novel resource with joint factuality and modality information for event-denoting expressions in Italian. We propose a novel annotation scheme, which is nonetheless consistent with existing ones, and compare different classification systems trained on ModaFact, as a preliminary step to the use of factuality and modality information in downstream tasks. The dataset and the best-performing model are publicly released and available under an open license.

Proceedings of the 4th Workshop on Perspectivist Approaches to NLP
Gavin Abercrombie | Valerio Basile | Simona Frenda | Sara Tonelli | Shiran Dudy
Proceedings of the 4th Workshop on Perspectivist Approaches to NLP

Multilingual vs Crosslingual Retrieval of Fact-Checked Claims: A Tale of Two Approaches
Alan Ramponi | Marco Rovera | Robert Moro | Sara Tonelli
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Retrieval of previously fact-checked claims is a well-established task, whose automation can assist professional fact-checkers in the initial steps of information verification. Previous works have mostly tackled the task monolingually, i.e., having both the input and the retrieved claims in the same language. However, especially for languages with a limited availability of fact-checks and in case of global narratives, such as pandemics, wars, or international politics, it is crucial to be able to retrieve claims across languages. In this work, we examine strategies to improve the multilingual and crosslingual performance, namely selection of negative examples (in the supervised setting) and re-ranking (in the unsupervised setting). We evaluate all approaches on a dataset containing posts and claims in 47 languages (283 language combinations). We observe that the best results are obtained by using LLM-based re-ranking, followed by fine-tuning with negative examples sampled using a sentence similarity-based strategy. Most importantly, we show that crosslinguality is a setup with its own unique characteristics compared to the multilingual setup.

Fine-grained Fallacy Detection with Human Label Variation
Alan Ramponi | Agnese Daffara | Sara Tonelli
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

We introduce FAINA, the first dataset for fallacy detection that embraces multiple plausible answers and natural disagreement. FAINA includes over 11K span-level annotations with overlaps across 20 fallacy types on social media posts in Italian about migration, climate change, and public health given by two expert annotators. Through an extensive annotation study that allowed discussion over multiple rounds, we minimize annotation errors whilst keeping signals of human label variation. Moreover, we devise a framework that goes beyond “single ground truth” evaluation and simultaneously accounts for multiple (equally reliable) test sets and the peculiarities of the task, i.e., partial span matches, overlaps, and the varying severity of labeling errors. Our experiments across four fallacy detection setups show that multi-task and multi-label transformer-based approaches are strong baselines across all settings. We release our data, code, and annotation guidelines to foster research on fallacy detection and human label variation more broadly.

WorthIt: Check-worthiness Estimation of Italian Social Media Posts
Agnese Daffara | Alan Ramponi | Sara Tonelli
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

ARG2ST at CQs-Gen 2025: Critical Questions Generation through LLMs and Usefulness-based Selection
Alan Ramponi | Gaudenzia Genoni | Sara Tonelli
Proceedings of the 12th Argument mining Workshop

Critical questions (CQs) generation for argumentative texts is a key task to promote critical thinking and counter misinformation. In this paper, we present a two-step approach for CQs generation that i) uses a large language model (LLM) for generating candidate CQs, and ii) leverages a fine-tuned classifier for ranking and selecting the top-k most useful CQs to present to the user. We show that such usefulness-based CQs selection consistently improves the performance over the standard application of LLMs. Our system was designed in the context of a shared task on CQs generation hosted at the 12th Workshop on Argument Mining, and represents a viable approach to encourage future developments on CQs generation. Our code is made available to the research community.

Multilingual Analysis of Narrative Properties in Conspiracist vs Mainstream Telegram Channels
Katarina Laken | Matteo Melis | Sara Tonelli | Marcos Garcia
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)

Conspiracist narratives posit an omnipotent, evil group causing harm throughout domains. However, modern-day online conspiracism is often more erratic, consisting of loosely connected posts displaying a general anti-establishment attitude pervaded by negative emotions. We gather a dataset of 300 conspiracist and mainstream Telegram channels in Italian and English and use the automatic extraction of entities and emotion detection to compare structural characteristics of both types of channels. We create a co-occurrence network of entities to analyze how the different types of channels introduce and use them across posts and topics. We find that conspiracist channels are characterized by anger. Moreover, co-occurrence networks of entities appearing in conspiracist channels are denser. We theorize that this reflects a narrative structure where all actants are pushed into a single domain. Conspiracist channels disproportionately associate the most central group of entities with anger and fear. We do not find evidence that entities in conspiracist narratives occur across more topics. This could indicate an erratic type of online conspiracism where everything can be connected to everything and that is characterized by a high number of entities and high levels of anger.

On the Impact of Hate Speech Synthetic Data on Model Fairness
Camilla Casula | Sara Tonelli
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

2024

Benchmarking the Semantics of Taste: Towards the Automatic Extraction of Gustatory Language
Teresa Paccosi | Sara Tonelli
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

In this paper, we present a benchmark containing texts manually annotated with gustatory semantic information. We employ a FrameNet-like approach previously tested to address olfactory language, which we adapt to capture gustatory events. We then propose an exploration of the data in the benchmark to show the possible insights brought by this type of approach, addressing the investigation of emotional valence in text genres. Eventually, we present a supervised system trained with the taste benchmark for the extraction of gustatory information from historical and contemporary texts.

Delving into Qualitative Implications of Synthetic Data for Hate Speech Detection
Camilla Casula | Sebastiano Vecellio Salto | Alan Ramponi | Sara Tonelli
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The use of synthetic data for training models for a variety of NLP tasks is now widespread. However, previous work reports mixed results with regards to its effectiveness on highly subjective tasks such as hate speech detection. In this paper, we present an in-depth qualitative analysis of the potential and specific pitfalls of synthetic data for hate speech detection in English, with 3,500 manually annotated examples. We show that, across different models, synthetic data created through paraphrasing gold texts can improve out-of-distribution robustness from a computational standpoint. However, this comes at a cost: synthetic data fails to reliably reflect the characteristics of real-world data on a number of linguistic dimensions, it results in drastically different class distributions, and it heavily reduces the representation of both specific identity groups and intersectional hate.

Putting Context in Context: the Impact of Discussion Structure on Text Classification
Nicolò Penzo | Antonio Longa | Bruno Lepri | Sara Tonelli | Marco Guerini
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Current text classification approaches usually focus on the content to be classified. Contextual aspects (both linguistic and extra-linguistic) are usually neglected, even in tasks based on online discussions. Still, in many cases the multi-party and multi-turn nature of the context from which these elements are selected can be fruitfully exploited. In this work, we propose a series of experiments on a large dataset for stance detection in English, in which we evaluate the contribution of different types of contextual information, i.e. linguistic, structural and temporal, by feeding them as natural language input into a transformer-based model. We also experiment with different amounts of training data and analyse the topology of local discussion networks in a privacy-compliant way. Results show that structural information can be highly beneficial to text classification but only under certain circumstances (e.g. depending on the amount of training data and on discussion chain complexity). Indeed, we show that contextual information on smaller datasets from other classification tasks does not yield significant improvements. Our framework, based on local discussion networks, allows the integration of structural information while minimising user profiling, thus preserving their privacy.

Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024
Gavin Abercrombie | Valerio Basile | Davide Bernadi | Shiran Dudy | Simona Frenda | Lucy Havens | Sara Tonelli
Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024

TimeFrame: Querying and Visualizing Event Semantic Frames in Time
Davide Lamorte | Marco Rovera | Alfio Ferrara | Sara Tonelli
Proceedings of the First Workshop on Reference, Framing, and Perspective @ LREC-COLING 2024

In this work we introduce TimeFrame, an online platform to easily query and visualize events and participants extracted from document collections in Italian following a frame-based approach. The system allows users to select one or more events (frames) or event categories and to display their occurrences on a timeline. Different query types, from coarse to fine-grained, are available through the interface, enabling a time-bound analysis of large historical corpora. We present three use cases based on the full archive of news published in 1948 by the newspaper “Corriere della Sera”. We show that different crucial events can be explored, providing interesting insights into the narratives around such events, the main participants and their points of view.

A New Annotation Scheme for the Semantics of Taste
Teresa Paccosi | Sara Tonelli
Proceedings of the 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024

This paper introduces a new annotation scheme for the semantics of gustatory language in English, which builds upon a previous framework for olfactory language based on frame semantics. The purpose of this annotation framework is to be used for annotating comparable resources for the study of sensory language and to create training datasets for supervised systems aimed at extracting sensory information. Furthermore, our approach incorporates words from specific historical periods, thereby enhancing the framework’s utility for studying language from a diachronic perspective.

Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations
Nicolò Penzo | Maryam Sajedinia | Bruno Lepri | Sara Tonelli | Marco Guerini
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Assessing the performance of systems to classify Multi-Party Conversations (MPC) is challenging due to the interconnection between linguistic and structural characteristics of conversations. Conventional evaluation methods often overlook variances in model behavior across different levels of structural complexity on interaction graphs. In this work, we propose a methodological pipeline to investigate model performance across specific structural attributes of conversations. As a proof of concept we focus on Response Selection and Addressee Recognition tasks, to diagnose model weaknesses. To this end, we extract representative diagnostic subdatasets with a fixed number of users and a good structural variety from a large and open corpus of online MPCs. We further frame our work in terms of data minimization, avoiding the use of original usernames to preserve privacy, and propose alternatives to using original text messages. Results show that response selection relies more on the textual content of conversations, while addressee recognition requires capturing their structural dimension. Using an LLM in a zero-shot setting, we further highlight how sensitivity to prompt variations is task-dependent.

Don’t Augment, Rewrite? Assessing Abusive Language Detection with Synthetic Data
Camilla Casula | Elisa Leonardelli | Sara Tonelli
Findings of the Association for Computational Linguistics: ACL 2024

Research on abusive language detection and content moderation is crucial to combat online harm. However, current limitations set by regulatory bodies and social media platforms can make it difficult to share collected data. We address this challenge by exploring the possibility to replace existing datasets in English for abusive language detection with synthetic data obtained by rewriting original texts with an instruction-based generative model. We show that such data can be effectively used to train a classifier whose performance is in line with, and sometimes better than, that of a classifier trained on original data. Training with synthetic data also seems to improve robustness in a cross-dataset setting. A manual inspection of the generated data confirms that rewriting makes it impossible to retrieve the original texts online.

2023

Scent and Sensibility: Perception Shifts in the Olfactory Domain
Teresa Paccosi | Stefano Menini | Elisa Leonardelli | Ilaria Barzon | Sara Tonelli
Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change

In this work, we investigate olfactory perception shifts, analysing how the description of the smells emitted by specific sources has changed over time. We first create a benchmark of selected smell sources, relying upon existing historical studies related to olfaction. We also collect an English text corpus by retrieving large collections of documents from freely available resources, spanning from 1500 to 2000 and covering different domains. We label such corpus using a system for olfactory information extraction inspired by frame semantics, where the semantic roles around the smell sources in the benchmark are marked. We then analyse how the roles describing Qualities of smell sources change over time and how they can contribute to characterise perception shifts, also in comparison with more standard statistical approaches.

Why Don’t You Do It Right? Analysing Annotators’ Disagreement in Subjective Tasks
Marta Sandri | Elisa Leonardelli | Sara Tonelli | Elisabetta Jezek
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Annotators’ disagreement in linguistic data has recently been the focus of multiple initiatives aimed at raising awareness on issues related to ‘majority voting’ when aggregating diverging annotations. Disagreement can indeed reflect different aspects of linguistic annotation, from annotators’ subjectivity to sloppiness or lack of enough context to interpret a text. In this work we first propose a taxonomy of possible reasons leading to annotators’ disagreement in subjective tasks. Then, we manually label part of a Twitter dataset for offensive language detection in English following this taxonomy, identifying how the different categories are distributed. Finally, we run a set of experiments aimed at assessing the impact of the different types of disagreement on classification performance. In particular, we investigate how accurately tweets belonging to different categories of disagreement can be classified as offensive or not, and how injecting data with different types of disagreement in the training set affects performance. We also perform offensive language detection as a multi-task framework, using disagreement classification as an auxiliary task.

This work introduces a novel, extensive annotated corpus for multi-label legislative text classification in Italian, based on legal acts from the Gazzetta Ufficiale, the official source of legislative information of the Italian state. The annotated dataset, which we released to the community, comprises over 363,000 titles of legislative acts, spanning over 30 years from 1988 until 2022. Moreover, we evaluate four models for text classification on the dataset, demonstrating how using only the acts’ titles can achieve top-level classification performance, with a micro F1-score of 0.87. Also, our analysis shows how Italian domain-adapted legal models do not outperform general-purpose models on the task. Models’ performance can be checked by users via a demonstrator system provided in support of this work.

Scent Mining: Extracting Olfactory Events, Smell Sources and Qualities
Stefano Menini | Teresa Paccosi | Serra Sinem Tekiroğlu | Sara Tonelli
Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Olfaction is a rather understudied sense compared to the other senses. In NLP, however, there have been recent attempts to develop taxonomies and benchmarks specifically designed to capture smell-related information. In this work, we further extend this research line by presenting a supervised system for olfactory information extraction in English. We cast this problem as a token classification task and build a system that identifies smell words, smell sources and qualities. The classifier is then applied to a set of English historical corpora, covering different domains and written in a time period between the 15th and the 20th Century. A qualitative analysis of the extracted data shows that they can be used to infer interesting information about smelly items such as tea and tobacco from a diachronical perspective, supporting historical investigation with corpus-based evidence.

When You Doubt, Abstain: A Study of Automated Fact-Checking in Italian under Domain Shift
Giovanni Valer | Alan Ramponi | Sara Tonelli
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

Generation-Based Data Augmentation for Offensive Language Detection: Is It Worth It?
Camilla Casula | Sara Tonelli
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Generation-based data augmentation (DA) has been presented in several works as a way to improve offensive language detection. However, the effectiveness of generative DA has been shown only in limited scenarios, and the potential injection of biases when using generated data to classify offensive language has not been investigated. Our aim is that of analyzing the feasibility of generative data augmentation more in-depth with two main focuses. First, we investigate the robustness of models trained on generated data in a variety of data augmentation setups, both novel and already presented in previous work, and compare their performance on four widely-used English offensive language datasets that present inherent differences in terms of content and complexity. In addition to this, we analyze models using the HateCheck suite, a series of functional tests created to challenge hate speech detection systems. Second, we investigate potential lexical bias issues through a qualitative analysis on the generated data. We find that the potential positive impact of generative data augmentation on model performance is unreliable, and generative DA can also have unpredictable effects on lexical bias.

2022

An Analysis of Abusive Language Data Collected through a Game with a Purpose
Federico Bonetti | Sara Tonelli
Proceedings of the 9th Workshop on Games and Natural Language Processing within the 13th Language Resources and Evaluation Conference

In this work we present an analysis of abusive language annotations collected through a 3D video game. With this approach, we are able to involve in the annotation teenagers, i.e. typical targets of cyberbullying, whose data are usually not available for research purposes. Using the game in the framework of educational activities to empower teenagers against online abuse we are able to obtain insights into how teenagers communicate, and what kind of messages they consider more offensive. While players produced interesting annotations and the distributions of classes between players and experts are similar, we obtained a significant number of mismatching judgements between experts and players.

Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis
Johanna Monti | Valerio Basile | Maria Pia Di Buono | Raffaele Manna | Antonio Pascucci | Sara Tonelli
Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis

BERToldo, the Historical BERT for Italian
Alessio Palmero Aprosio | Stefano Menini | Sara Tonelli
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

Recent works in historical language processing have shown that transformer-based models can be successfully created using historical corpora, and that using them for analysing and classifying data from the past can be beneficial compared to standard transformer models. This has led to the creation of BERT-like models for different languages trained with digital repositories from the past. In this work we introduce the Italian version of historical BERT, which we call BERToldo. We evaluate the model on the task of PoS-tagging Dante Alighieri’s works, considering not only the tagger performance but also the model size and the time needed to train it. We also address the problem of duplicated data, which is rather common for languages with a limited availability of historical corpora. We show that deduplication reduces training time without affecting performance. The model and its smaller versions are all made available to the research community.

We present a benchmark in six European languages containing manually annotated information about olfactory situations and events following a FrameNet-like approach. The document selection covers ten domains of interest to cultural historians in the olfactory domain and includes texts published between 1620 and 1920, allowing a diachronic analysis of smell descriptions. With this work, we aim to foster the development of olfactory information extraction approaches as well as the analysis of changes in smell descriptions over time.

Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection
Alan Ramponi | Sara Tonelli
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Avoiding reliance on dataset artifacts to predict hate speech is a cornerstone of robust and fair hate speech detection. In this paper we critically analyze lexical biases in hate speech detection via a cross-platform study, disentangling various types of spurious and authentic artifacts and analyzing their impact on out-of-distribution fairness and robustness. We experiment with existing approaches and propose simple yet surprisingly effective data-centric baselines. Our results on English data across four platforms show that distinct spurious artifacts require different treatments to ultimately attain both robustness and fairness in hate speech detection. To encourage research in this direction, we release all baseline models and the code to compute artifacts, pointing it out as a complementary and necessary addition to the data statements practice.

Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
Gavin Abercrombie | Valerio Basile | Sara Tonelli | Verena Rieser | Alexandra Uma
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022

Building a Multilingual Taxonomy of Olfactory Terms with Timestamps
Stefano Menini | Teresa Paccosi | Serra Sinem Tekiroğlu | Sara Tonelli
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Olfactory references play a crucial role in our memory and, more generally, in our experiences, since researchers have shown that smell is the sense that is most directly connected with emotions. Nevertheless, only a few works in NLP have tried to capture this sensory dimension from a computational perspective. One of the main challenges is the lack of a systematic and consistent taxonomy of olfactory information, where concepts are also organised in a multi-lingual perspective. WordNet represents a valuable starting point in this direction, which can be semi-automatically extended taking advantage of Google n-grams and of existing language models. In this work we describe the process that has led to the semi-automatic development of a taxonomy for olfactory information in four languages (English, French, German and Italian), detailing the different steps and the intermediate evaluations. Along with being multi-lingual, the taxonomy also encloses temporal marks for olfactory terms, thus making it a valuable resource for historical content analysis. The resource has been released and is freely available.

Work Hard, Play Hard: Collecting Acceptability Annotations through a 3D Game
Federico Bonetti | Elisa Leonardelli | Daniela Trotta | Raffaele Guarasci | Sara Tonelli
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Corpus-based studies on acceptability judgements have always stimulated the interest of researchers, both in theoretical and computational fields. Some approaches focused on spontaneous judgements collected through different types of tasks, others on data annotated through crowd-sourcing platforms, still others relied on expert annotated data available from the literature. The release of the CoLA corpus, a large-scale corpus of sentences extracted from linguistic handbooks as examples of acceptable/non acceptable phenomena in English, has revived interest in the reliability of judgements of linguistic experts vs. non-experts. Several issues are still open. In this work, we contribute to this debate by presenting a 3D video game that was used to collect acceptability judgments on Italian sentences. We analyse the resulting annotations in terms of agreement among players and by comparing them with experts’ acceptability judgments. We also discuss different game settings to assess their impact on participants’ motivation and engagement. The final dataset, containing 1,062 sentences selected based on majority voting, is released for future research and comparisons.

2021

FrameNet-like Annotation of Olfactory Information in Texts
Sara Tonelli | Stefano Menini
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Although olfactory references play a crucial role in our cultural memory, only a few works in NLP have tried to capture them from a computational perspective. Currently, the main challenge is not so much the development of technological components for olfactory information extraction, given recent advances in semantic processing and natural language understanding, but rather the lack of a theoretical framework to capture this information from a linguistic point of view, as a preliminary step towards the development of automated systems. Therefore, in this work we present the annotation guidelines, developed with the help of history scholars and domain experts, aimed at capturing all the relevant elements involved in olfactory situations or events described in texts. These guidelines have been inspired by FrameNet annotation, but underwent some adaptations, which are detailed in this paper. Furthermore, we present a case study concerning the annotation of olfactory situations in English historical travel writings describing trips to Italy. An analysis of the most frequent role fillers shows that olfactory descriptions pertain to some typical domains such as religion, food, nature, ancient past, poor sanitation, all supporting the creation of a stereotypical imagery related to Italy. On the other hand, positive feelings triggered by smells are prevalent, and contribute to framing travels to Italy as an exciting experience involving all senses.

Challenges in Designing Games with a Purpose for Abusive Language Annotation
Federico Bonetti | Sara Tonelli
Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing

In this paper we discuss several challenges related to the development of a 3D game, whose goal is to raise awareness of cyberbullying while collecting linguistic annotation on offensive language. The game is meant to be used by teenagers, thus raising a number of issues that need to be tackled during development. For example, the game aesthetics should be appealing for players belonging to this age group, but at the same time all possible solutions should be implemented to meet privacy requirements. Also, the task of linguistic annotation should ideally be hidden, adopting so-called orthogonal game mechanics, without affecting the quality of collected data. While some of these challenges are being tackled in the game development, some others are discussed in this paper but still lack an ultimate solution.

pdf bib abs

Are Gestures Worth a Thousand Words? An Analysis of Interviews in the Political Domain
Daniela Trotta | Sara Tonelli
Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)

Speaker gestures are semantically co-expressive with speech and serve different pragmatic functions to accompany oral modality. Therefore, gestures are an inseparable part of the language system: they may add clarity to discourse, can be employed to facilitate lexical retrieval and retain a turn in conversations, assist in verbalizing semantic content and facilitate speakers in coming up with the words they intend to say. This aspect is particularly relevant in political discourse, where speakers try to apply communication strategies that are both clear and persuasive using verbal and non-verbal cues. In this paper we investigate the co-speech gestures of several Italian politicians during face-to-face interviews using a multimodal linguistic approach. We first enrich an existing corpus with a novel annotation layer capturing the function of hand movements. Then, we perform an analysis of the corpus, focusing in particular on the relationship between hand movements and other information layers such as the political party or non-lexical and semi-lexical tags. We observe that the recorded differences pertain more to single politicians than to the party they belong to, and that hand movements tend to occur frequently with semi-lexical phenomena, supporting the lexical retrieval hypothesis.

pdf bib abs

Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList
Marta Marchiori Manerba | Sara Tonelli
Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)

Current abusive language detection systems have demonstrated unintended bias towards sensitive features such as nationality or gender. This is a crucial issue, which may harm minorities and underrepresented groups if such systems were integrated in real-world applications. In this paper, we create ad hoc tests through the CheckList tool (Ribeiro et al., 2020) to detect biases within abusive language classifiers for English. We compare the behaviour of two BERT-based models, one trained on a generic hate speech dataset and the other on a dataset for misogyny detection. Our evaluation shows that, although BERT-based classifiers achieve high accuracy levels on a variety of natural language processing tasks, they perform very poorly as regards fairness and bias, in particular on samples involving implicit stereotypes, expressions of hate towards minorities and protected attributes such as race or sexual orientation. We release both the notebooks implemented to extend the Fairness tests and the synthetic datasets usable to evaluate systems bias independently of CheckList.

pdf bib abs

Monolingual and Cross-Lingual Acceptability Judgments with the Italian CoLA corpus
Daniela Trotta | Raffaele Guarasci | Elisa Leonardelli | Sara Tonelli
Findings of the Association for Computational Linguistics: EMNLP 2021

The development of automated approaches to linguistic acceptability has been greatly fostered by the availability of the English CoLA corpus, which has also been included in the widely used GLUE benchmark. However, this kind of research for languages other than English, as well as the analysis of cross-lingual approaches, has been hindered by the lack of resources with a comparable size in other languages. We have therefore developed the ItaCoLA corpus, containing almost 10,000 sentences with acceptability judgments, which has been created following the same approach and the same steps as the English one. In this paper we describe the corpus creation, we detail its content, and we present the first experiments on this new resource. We compare in-domain and out-of-domain classification, and perform a specific evaluation of nine linguistic phenomena. We also present the first cross-lingual experiments, aimed at assessing whether multilingual transformer-based approaches can benefit from using sentences in two languages during fine-tuning.

pdf bib abs

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators’ Disagreement
Elisa Leonardelli | Stefano Menini | Alessio Palmero Aprosio | Marco Guerini | Sara Tonelli
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving scenario of social media. While several approaches have been proposed to tackle the problem from an algorithmic perspective, so as to reduce the need for annotated data, less attention has been paid to the quality of these data. Following a trend that has emerged recently, we focus on the level of agreement among annotators while selecting data to create offensive language datasets, a task involving a high level of subjectivity. Our study comprises the creation of three novel datasets of English tweets covering different topics and having five crowd-sourced judgments each. We also present an extensive set of experiments showing that selecting training and test data according to different levels of annotators’ agreement has a strong effect on classifier performance and robustness. Our findings are further validated in cross-domain experiments and studied using a popular benchmark dataset. We show that such hard cases, where low agreement is present, are not necessarily due to poor-quality annotation and we advocate for a higher presence of ambiguous cases in future datasets, in order to train more robust systems and better account for the different points of view expressed online.
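
As a rough illustration of grouping items by annotator agreement when each item has five judgments, the following minimal sketch buckets items by the number of annotators backing the majority label. The function names, the label strings and the data layout are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def agreement_level(judgments):
    """Return (majority_label, votes_for_majority) for one item's judgments."""
    label, votes = Counter(judgments).most_common(1)[0]
    return label, votes


def split_by_agreement(items):
    """Group (text, judgments) pairs by how many annotators backed the majority label.

    With five binary judgments per item the keys are 5 (unanimous),
    4 (mild disagreement) and 3 (strong disagreement).
    """
    buckets = {}
    for text, judgments in items:
        label, votes = agreement_level(judgments)
        buckets.setdefault(votes, []).append((text, label))
    return buckets


if __name__ == "__main__":
    # Hypothetical toy data: labels "off" / "not" are placeholders.
    sample = [
        ("tweet a", ["off", "off", "off", "off", "off"]),
        ("tweet b", ["off", "off", "off", "not", "not"]),
        ("tweet c", ["not", "not", "not", "not", "off"]),
    ]
    for votes, group in sorted(split_by_agreement(sample).items(), reverse=True):
        print(votes, group)
```

Training and test subsets could then be drawn from individual buckets to compare classifiers across agreement levels.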

pdf bib

It Is MarkIT That Is New: An Italian Treebank of Marked Constructions
Teresa Paccosi | Alessio Palmero Aprosio | Sara Tonelli
Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021)

2020

pdf bib abs

Hybrid Emoji-Based Masked Language Models for Zero-Shot Abusive Language Detection
Michele Corazza | Stefano Menini | Elena Cabrio | Sara Tonelli | Serena Villata
Findings of the Association for Computational Linguistics: EMNLP 2020

Recent studies have demonstrated the effectiveness of cross-lingual language model pre-training on different NLP tasks, such as natural language inference and machine translation. In our work, we test this approach on social media data, which are particularly challenging to process within this framework, since the limited length of the textual messages and the irregularity of the language make it harder to learn meaningful encodings. More specifically, we propose a hybrid emoji-based Masked Language Model (MLM) to leverage the common information conveyed by emojis across different languages and improve the learned cross-lingual representation of short text messages, with the goal of performing zero-shot abusive language detection. We compare the results obtained with the original MLM to the ones obtained by our method, showing improved performance on German, Italian and Spanish.

pdf bib abs

FBK-DH at SemEval-2020 Task 12: Using Multi-channel BERT for Multilingual Offensive Language Detection
Camilla Casula | Alessio Palmero Aprosio | Stefano Menini | Sara Tonelli
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this paper we present our submission to sub-task A at SemEval 2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval2). For Danish, Turkish, Arabic and Greek, we develop an architecture based on transfer learning and relying on a two-channel BERT model, in which the English BERT and the multilingual one are combined after creating a machine-translated parallel corpus for each language in the task. For English, instead, we adopt a more standard, single-channel approach. We find that, in a multilingual scenario, with some languages having small training data, using parallel BERT models with machine translated data can give systems more stability, especially when dealing with noisy data. The fact that machine translation on social media data may not be perfect does not hurt the overall classification performance.

pdf bib abs

A 3D Role-Playing Game for Abusive Language Annotation
Federico Bonetti | Sara Tonelli
Workshop on Games and Natural Language Processing

Gamification has been applied to many linguistic annotation tasks, as an alternative to crowdsourcing platforms to collect annotated data in an inexpensive way. However, we think that much remains to be explored. Games with a Purpose (GWAPs) tend to lack important elements that we commonly see in commercial games, such as 2D and 3D worlds or a story. Making GWAPs more similar to full-fledged video games in order to involve users more easily and increase dissemination is a demanding yet interesting ground to explore. In this paper we present a 3D role-playing game for abusive language annotation that is currently under development.

pdf bib abs

Adding Gesture, Posture and Facial Displays to the PoliModal Corpus of Political Interviews
Daniela Trotta | Alessio Palmero Aprosio | Sara Tonelli | Annibale Elia
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper introduces a multimodal corpus in the political domain, which on top of transcribed face-to-face interviews presents the annotation of facial displays, hand gestures and body posture. While the fully annotated corpus consists of 3 interviews for a total of 90 minutes, it is extracted from a larger available corpus of 56 face-to-face interviews (14 hours) that has been manually annotated with information about metadata (i.e. tools used for the transcription, link to the interview etc.), pauses (used to mark a pause either between or within utterances), vocal expressions (marking non-lexical expressions such as burp and semi-lexical expressions such as primary interjections), deletions (false starts, repetitions and truncated words) and overlaps. In this work, we describe the additional level of annotation relating to nonverbal elements used by three Italian politicians who belonged to three different political parties and who, at the time of the talk show, were all candidates for the presidency of the Council of Ministers. We also present the results of some analyses aimed at identifying existing relations between the proxemics phenomena and the linguistic structures in which they occur in order to capture recurring patterns and differences in the communication strategy.

pdf bib

Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling
Camilla Casula | Sara Tonelli
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

pdf bib

A Multimodal Dataset of Images and Text to Study Abusive Language
Stefano Menini | Alessio Palmero Aprosio | Sara Tonelli
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

pdf bib

Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language
Johanna Monti | Valerio Basile | Maria Pia Di Buono | Raffaele Manna | Antonio Pascucci | Sara Tonelli
Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language

pdf bib

The CREENDER Tool for Creating Multimodal Datasets of Images and Comments
Alessio Palmero Aprosio | Stefano Menini | Sara Tonelli
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

2019

pdf bib

Cross-Platform Evaluation for Italian Hate Speech Detection
Michele Corazza | Stefano Menini | Elena Cabrio | Sara Tonelli | Serena Villata
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

pdf bib

Annotation and Analysis of the PoliModal Corpus of Political Interviews
Daniela Trotta | Sara Tonelli | Alessio Palmero Aprosio | Annibale Elia
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

pdf bib abs

Social media platforms like Twitter and Instagram face a surge in cyberbullying phenomena against young users and need to develop scalable computational methods to limit the negative consequences of this kind of abuse. Despite the number of approaches recently proposed in the Natural Language Processing (NLP) research area for detecting different forms of abusive language, the issue of identifying cyberbullying phenomena at scale is still an unsolved problem. This is because of the need to couple abusive language detection on textual message with network analysis, so that repeated attacks against the same person can be identified. In this paper, we present a system to monitor cyberbullying phenomena by combining message classification and social network analysis. We evaluate the classification module on a data set built on Instagram messages, and we describe the cyberbullying monitoring user interface.

pdf bib abs

Novel Event Detection and Classification for Historical Texts
Rachele Sprugnoli | Sara Tonelli
Computational Linguistics, Volume 45, Issue 2 - June 2019

Event processing is an active area of research in the Natural Language Processing community, but resources and automatic systems developed so far have mainly addressed contemporary texts. However, the recognition and elaboration of events is a crucial step when dealing with historical texts, particularly in the current era of massive digitization of historical sources: research in this domain can lead to the development of methodologies and tools that can assist historians in enhancing their work, while having an impact also on the field of Natural Language Processing. Our work aims at shedding light on the complex concept of events when dealing with historical texts. More specifically, we introduce new annotation guidelines for event mentions and types, categorized into 22 classes. Then, we annotate a historical corpus accordingly, and compare two approaches for automatic event detection and classification following this novel scheme. We believe that this work can foster research in a field of inquiry as yet underestimated in the area of Temporal Information Processing. To this end, we release new annotation guidelines, a corpus, and new models for automatic annotation.

pdf bib abs

Neural Text Simplification in Low-Resource Conditions Using Weak Supervision
Alessio Palmero Aprosio | Sara Tonelli | Marco Turchi | Matteo Negri | Mattia A. Di Gangi
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation

Neural text simplification has gained increasing attention in the NLP community thanks to recent advancements in deep sequence-to-sequence learning. Most recent efforts with such a data-demanding paradigm have dealt with the English language, for which sizeable training datasets are currently available to deploy competitive models. Similar improvements on less resource-rich languages depend either on intensive manual work to create training data, or on the design of effective automatic generation techniques to bypass the data acquisition bottleneck. Inspired by the machine translation field, in which synthetic parallel pairs generated from monolingual data yield significant improvements to neural models, in this paper we exploit large amounts of heterogeneous data to automatically select simple sentences, which are then used to create synthetic simplification pairs. We also evaluate other solutions, such as oversampling and the use of external word embeddings to be fed to the neural simplification system. Our approach is evaluated on Italian and Spanish, for which only a few thousand gold sentence pairs are available. The results show that these techniques yield performance improvements over a baseline sequence-to-sequence configuration.

pdf bib

Prendo la Parola in Questo Consesso Mondiale: A Multi-Genre 20th Century Corpus in the Political Domain
Sara Tonelli | Rachele Sprugnoli | Giovanni Moretti
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

pdf bib

Automated Short Answer Grading: A Simple Solution for a Difficult Task
Stefano Menini | Sara Tonelli | Giovanni De Gasperis | Pierpaolo Vittorini
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

2018

pdf bib abs

Creating a WhatsApp Dataset to Study Pre-teen Cyberbullying
Rachele Sprugnoli | Stefano Menini | Sara Tonelli | Filippo Oncini | Enrico Piras
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)

Although WhatsApp is used by teenagers as one major channel of cyberbullying, such interactions remain invisible due to the app's privacy policies, which do not allow ex-post data collection. Indeed, most of the information on these phenomena relies on surveys regarding self-reported data. In order to overcome this limitation, we describe in this paper the activities that led to the creation of a WhatsApp dataset to study cyberbullying among Italian students aged 12-13. We present not only the collected chats with annotations about user role and type of offense, but also the living lab created in a collaboration between researchers and schools to monitor and analyse cyberbullying. Finally, we discuss some open issues, dealing with ethical, operational and epistemic aspects.

pdf bib

Towards Personalised Simplification based on L2 Learners’ Native Language
Alessio Palmero Aprosio | Stefano Menini | Sara Tonelli | Luca Ducceschi | Leonardo Herzog
Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)

pdf bib

Analysing the Evolution of Students’ Writing Skills and the Impact of Neo-standard Italian with the help of Computational Linguistics
Rachele Sprugnoli | Sara Tonelli | Alessio Palmero Aprosio | Giovanni Moretti
Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018)

2017

pdf bib abs

MUSST: A Multilingual Syntactic Simplification Tool
Carolina Scarton | Alessio Palmero Aprosio | Sara Tonelli | Tamara Martín Wanton | Lucia Specia
Proceedings of the IJCNLP 2017, System Demonstrations

We describe MUSST, a multilingual syntactic simplification tool. The tool supports sentence simplifications for English, Italian and Spanish, and can be easily extended to other languages. Our implementation includes a set of general-purpose simplification rules, as well as a sentence selection module (to select sentences to be simplified) and a confidence model (to select only promising simplifications). The tool was implemented in the context of the European project SIMPATICO on text simplification for Public Administration (PA) texts. Our evaluation on sentences in the PA domain shows that we obtain correct simplifications for 76% of the simplified cases in English and 71% of the cases in Spanish. For Italian, the results are lower (38%) but the tool is still under development.

pdf bib abs

You’ll Never Tweet Alone: Building Sports Match Timelines from Microblog Posts
Amosse Edouard | Elena Cabrio | Sara Tonelli | Nhan Le-Thanh
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

In this paper, we propose an approach to build a timeline with actions in a sports game based on tweets. We combine information provided by external knowledge bases to enrich the content of the tweets, and apply graph theory to model relations between actions and participants in a game. We demonstrate the validity of our approach using tweets collected during the EURO 2016 Championship and evaluate the output against live summaries produced by sports channels.

pdf bib abs

The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts
Rachele Sprugnoli | Tommaso Caselli | Sara Tonelli | Giovanni Moretti
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments.

pdf bib abs

Graph-based Event Extraction from Twitter
Amosse Edouard | Elena Cabrio | Sara Tonelli | Nhan Le-Thanh
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Detecting which tweets describe a specific event and clustering them is one of the main challenging tasks related to Social Media currently addressed in the NLP community. Existing approaches have mainly focused on detecting spikes in clusters around specific keywords or Named Entities (NE). However, one of the main drawbacks of such approaches is the difficulty in understanding when the same keywords describe different events. In this paper, we propose a novel approach that exploits NE mentions in tweets and their entity context to create a temporal event graph. Then, using simple graph theory techniques and a PageRank-like algorithm, we process the event graphs to detect clusters of tweets describing the same events. Experiments on two gold standard datasets show that our approach achieves state-of-the-art results both in terms of evaluation performances and the quality of the detected events.

pdf bib

A little bit of bella pianura: Detecting Code-Mixing in Historical English Travel Writing
Rachele Sprugnoli | Sara Tonelli | Giovanni Moretti | Stefano Menini
Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)

pdf bib abs

RAMBLE ON: Tracing Movements of Popular Historical Figures
Stefano Menini | Rachele Sprugnoli | Giovanni Moretti | Enrico Bignotti | Sara Tonelli | Bruno Lepri
Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics

We present RAMBLE ON, an application integrating a pipeline for frame-based information extraction and an interface to track and display movement trajectories. The code of the extraction pipeline and a navigator are freely available; moreover we display in a demonstrator the outcome of a case study carried out on trajectories of notable persons of the XX Century.

pdf bib abs

Building timelines of soccer matches from Twitter
Amosse Edouard | Elena Cabrio | Sara Tonelli | Nhan Le-Thanh
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

This demo paper presents a system that builds a timeline with salient actions of a soccer game, based on the tweets posted by users. It combines information provided by external knowledge bases to enrich the content of tweets and applies graph theory to model relations between actions (e.g. goals, penalties) and participants of a game (e.g. players, teams). In the demo, a web application displays in nearly real-time the actions detected from tweets posted by users for a given match of Euro 2016. Our tools are freely available at https://bitbucket.org/eamosse/event_tracking.

pdf bib abs

Topic-Based Agreement and Disagreement in US Electoral Manifestos
Stefano Menini | Federico Nanni | Simone Paolo Ponzetto | Sara Tonelli
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a topic-based analysis of agreement and disagreement in political manifestos, which relies on a new method for topic detection based on key concept clustering. Our approach outperforms both standard techniques like LDA and a state-of-the-art graph-based method, and provides promising initial results for this new task in computational social science.

pdf bib

The Impact of Phrases on Italian Lexical Simplification
Sara Tonelli | Alessio Palmero Aprosio | Marco Mazzon
Proceedings of the Fourth Italian Conference on Computational Linguistics (CLiC-it 2017)

2016

pdf bib abs

PreMOn: a Lemon Extension for Exposing Predicate Models as Linked Data
Francesco Corcoglioniti | Marco Rospocher | Alessio Palmero Aprosio | Sara Tonelli
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We introduce PreMOn (predicate model for ontologies), a linguistic resource for exposing predicate models (PropBank, NomBank, VerbNet, and FrameNet) and mappings between them (e.g., SemLink) as Linked Open Data. It consists of two components: (i) the PreMOn Ontology, an extension of the lemon model by the W3C Ontology-Lexica Community Group, which enables the homogeneous representation of data from the various predicate models; and (ii) the PreMOn Dataset, a collection of RDF datasets integrating various versions of the aforementioned predicate models and mapping resources. PreMOn is freely available and accessible online in different ways, including through a dedicated SPARQL endpoint.

pdf bib abs

CATENA: CAusal and TEmporal relation extraction from NAtural language texts
Paramita Mirza | Sara Tonelli
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We present CATENA, a sieve-based system to perform temporal and causal relation extraction and classification from English texts, exploiting the interaction between the temporal and the causal model. We evaluate the performance of each sieve, showing that the rule-based, the machine-learned and the reasoning components all contribute to achieving state-of-the-art performance on TempEval-3 and TimeBank-Dense data. Although causal relations are much sparser than temporal ones, the architecture and the selected features are mostly suitable to serve both tasks. The effects of the interaction between the temporal and the causal components, although limited, yield promising results and confirm the tight connection between the temporal and the causal dimension of texts.

pdf bib abs

Agreement and Disagreement: Comparison of Points of View in the Political Domain
Stefano Menini | Sara Tonelli
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

The automated comparison of points of view between two politicians is a very challenging task, due not only to the lack of annotated resources, but also to the different dimensions contributing to the definition of agreement and disagreement. In order to shed light on this complex task, we first carry out a pilot study to manually annotate the components involved in detecting agreement and disagreement. Then, based on these findings, we implement different features to capture them automatically via supervised classification. We do not focus on debates in dialogical form, but rather consider sets of documents, in which politicians may express their position with respect to different topics in an implicit or explicit way, as during an electoral campaign. We create and make available three different datasets.

pdf bib abs

On the contribution of word embeddings to temporal relation classification
Paramita Mirza | Sara Tonelli
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Temporal relation classification is a challenging task, especially when there are no explicit markers to characterise the relation between temporal entities. This occurs frequently in inter-sentential relations, whose entities are not connected via direct syntactic relations making classification even more difficult. In these cases, resorting to features that focus on the semantic content of the event words may be very beneficial for inferring implicit relations. Specifically, while morpho-syntactic and context features are considered sufficient for classifying event-timex pairs, we believe that exploiting distributional semantic information about event words can benefit supervised classification of other types of pairs. In this work, we assess the impact of using word embeddings as features for event words in classifying temporal relations of event-event pairs and event-DCT (document creation time) pairs.

pdf bib abs

NLP and Public Engagement: The Case of the Italian School Reform
Tommaso Caselli | Giovanni Moretti | Rachele Sprugnoli | Sara Tonelli | Damien Lanfrey | Donatella Solda Kutzmann
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we present PIERINO (PIattaforma per l’Estrazione e il Recupero di INformazione Online), a system that was implemented in collaboration with the Italian Ministry of Education, University and Research to analyse the citizens’ comments given in the #labuonascuola survey. The platform includes various levels of automatic analysis such as key-concept extraction and word co-occurrences. Each analysis is displayed through an intuitive view using different types of visualizations, for example radar charts and sunburst diagrams. PIERINO was effectively used to support shaping the last Italian school reform, proving the potential of NLP in the context of policy making.

2015

pdf bib

Recognizing Biographical Sections in Wikipedia
Alessio Palmero Aprosio | Sara Tonelli
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

pdf bib

Annotating Causality in the TempEval-3 Corpus
Paramita Mirza | Rachele Sprugnoli | Sara Tonelli | Manuela Speranza
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

pdf bib

An Analysis of Causality between Events and its Relation to Temporal Information
Paramita Mirza | Sara Tonelli
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib

Classifying Temporal Relations with Simple Features
Paramita Mirza | Sara Tonelli
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib abs

CROMER: a Tool for Cross-Document Event and Entity Coreference
Christian Girardi | Manuela Speranza | Rachele Sprugnoli | Sara Tonelli
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we present CROMER (CROss-document Main Events and entities Recognition), a novel tool to manually annotate event and entity coreference across clusters of documents. The tool has been developed so as to handle large collections of documents, perform collaborative annotation (several annotators can work on the same clusters), and enable the linking of the annotated data to external knowledge sources. Given the availability of semantic information encoded in Semantic Web resources, this tool is designed to support annotators in linking entities and events to DBPedia and Wikipedia, so as to facilitate the automatic retrieval of additional semantic information. In this way, event modelling and chaining is made easy, while guaranteeing the highest interconnection with external resources. For example, the tool can be easily linked to event models such as the Simple Event Model [Van Hage et al., 2011] and the Grounded Annotation Framework [Fokkens et al., 2013].

2013

pdf bib

Outsourcing FrameNet to the Crowd
Marco Fossati | Claudio Giuliano | Sara Tonelli
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib

Mining Fine-grained Opinion Expressions with Shallow Parsing
Sucheta Ghosh | Sara Tonelli | Richard Johansson
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf bib

FBK: Sentiment Analysis in Twitter with Tweetsted
Md. Faisal Mahbub Chowdhury | Marco Guerini | Sara Tonelli | Alberto Lavelli
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

pdf bib abs

Improving the Recall of a Discourse Parser by Constraint-based Postprocessing
Sucheta Ghosh | Richard Johansson | Giuseppe Riccardi | Sara Tonelli
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe two constraint-based methods that can be used to improve the recall of a shallow discourse parser based on conditional random field chunking. These methods use a set of natural structural constraints as well as others that follow from the annotation guidelines of the Penn Discourse Treebank. We evaluated the resulting systems on the standard test set of the PDTB and achieved a rebalancing of precision and recall with improved F-measures across the board. This was especially notable when we used evaluation metrics taking partial matches into account; for these measures, we achieved F-measure improvements of several points.

pdf bib

Hunting for Entailing Pairs in the Penn Discourse Treebank
Sara Tonelli | Elena Cabrio
Proceedings of COLING 2012

pdf bib

Key-concept extraction from French articles with KX
Sara Tonelli | Elena Cabrio | Emanuele Pianta
JEP-TALN-RECITAL 2012, Workshop DEFT 2012: DÉfi Fouille de Textes (DEFT 2012 Workshop: Text Mining Challenge)

pdf bib

Making Readability Indices Readable
Sara Tonelli | Ke Tran Manh | Emanuele Pianta
Proceedings of the First Workshop on Predicting and Improving Text Readability for target reader populations

2011

pdf bib

Desperately Seeking Implicit Arguments in Text
Sara Tonelli | Rodolfo Delmonte
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics

pdf bib

Shallow Discourse Parsing with Conditional Random Fields
Sucheta Ghosh | Richard Johansson | Giuseppe Riccardi | Sara Tonelli
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

pdf bib

VENSES++: Adapting a deep semantic processing system to the identification of null instantiations
Sara Tonelli | Rodolfo Delmonte
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib abs

VenPro: A Morphological Analyzer for Venetan
Sara Tonelli | Emanuele Pianta | Rodolfo Delmonte | Michele Brunelli
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This document reports the process of extending MorphoPro for Venetan, a lesser-used language spoken in the North-Eastern part of Italy. MorphoPro is the morphological component of TextPro, a suite of tools oriented towards a number of NLP tasks. In order to extend this component to Venetan, we developed a declarative representation of the morphological knowledge necessary to analyze and synthesize Venetan words. This task was challenging for several reasons, which are common to a number of lesser-used languages: although Venetan is widely used as an oral language in everyday life, its written usage is very limited; efforts for defining a standard orthography and grammar are very recent and not well established; despite recent attempts to propose a unified orthography, no Venetan standard is widely used. Besides, there are different geographical varieties and it is strongly influenced by Italian.

pdf bib

KX: A Flexible System for Keyphrase eXtraction
Emanuele Pianta | Sara Tonelli
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib abs

Annotation of Discourse Relations for Conversational Spoken Dialogs
Sara Tonelli | Giuseppe Riccardi | Rashmi Prasad | Aravind Joshi
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we make a qualitative and quantitative analysis of discourse relations within the LUNA conversational spoken dialog corpus. In particular, we first describe the Penn Discourse Treebank (PDTB) and then we detail the adaptation of its annotation scheme to the LUNA corpus of Italian task-oriented dialogs in the domain of software/hardware assistance. We discuss similarities and differences between our approach and the PDTB paradigm and point out the peculiarities of spontaneous dialogs w.r.t. written text, which motivated some changes in the annotation strategy. In particular, we introduced the annotation of relations between non-contiguous arguments and we modified the sense hierarchy in order to take into account the important role of pragmatics in dialogs. In the final part of the paper, we present a comparison between the sense and connective frequency in a representative subset of the LUNA corpus and in the PDTB. Such analysis confirmed the differences between the two corpora and corroborates our choice to introduce dialog-specific adaptations.

2009

pdf bib

Annotating Spoken Dialogs: From Speech Segments to Dialog Acts and Frame Semantics
Marco Dinarelli | Silvia Quarteroni | Sara Tonelli | Alessandro Moschitti | Giuseppe Riccardi
Proceedings of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language

pdf bib

A novel approach to mapping FrameNet lexical units to WordNet synsets (short paper)
Sara Tonelli | Emanuele Pianta
Proceedings of the Eight International Conference on Computational Semantics

pdf bib

Wikipedia as Frame Information Repository
Sara Tonelli | Claudio Giuliano
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib

New Features for FrameNet - WordNet Mapping
Sara Tonelli | Daniele Pighin
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)

pdf bib

Three Issues in Cross-Language Frame Information Transfer
Sara Tonelli | Emanuele Pianta
Proceedings of the International Conference RANLP-2009

2008

pdf bib abs

Frame Information Transfer from English to Italian
Sara Tonelli | Emanuele Pianta
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe an automatic projection algorithm for transferring frame-semantic information from English to Italian texts as a first step towards the creation of Italian FrameNet. Given an English text with frame information and its Italian translation, we project the annotation in four steps: first the Italian text is parsed, then English-Italian alignment is automatically carried out at word level, then we extract the semantic head for every annotated constituent on the English corpus side and finally we project annotation from English to Italian using aligned semantic heads as a bridge. With our work, we point out typical features of the Italian language as regards frame-semantic annotation; in particular, we describe peculiarities of Italian that at the moment make the projection task more difficult than in the above-mentioned examples. Besides, we created a gold standard with 987 manually annotated sentences to evaluate the algorithm.

pdf bib abs

Enriching the Venice Italian Treebank with Dependency and Grammatical Relations
Sara Tonelli | Rodolfo Delmonte | Antonella Bristot
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we propose a rule-based approach to extract dependency and grammatical functions from the Venice Italian Treebank, a Treebank of written text with PoS and constituent labels consisting of 10,200 utterances and about 274,000 tokens. As manual corpus annotation is expensive and time-consuming, we decided to exploit this existing constituency-based Treebank to derive dependency structures with lower effort. After describing the procedure to extract heads and dependents, based on a head percolation table for Italian, we introduce the rules adopted to add grammatical relation labels. To this purpose, we manually relabeled all non-canonical arguments, which are very frequent in Italian, then we automatically labeled the remaining complements or arguments following some syntactic restrictions based on the position of the constituents w.r.t. parent and sibling nodes. The final section of the paper describes evaluation results. Evaluation was carried out in two steps, one for dependency relations and one for grammatical roles. Results are in line with similar conversion algorithms carried out for other languages, with 0.97 precision on dependency arcs and F-measure for the main grammatical functions scoring 0.96 or above, except for obliques with 0.75.
