Alan Ramponi


2023

DiatopIt: A Corpus of Social Media Posts for the Study of Diatopic Language Variation in Italy
Alan Ramponi | Camilla Casula
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023)

We introduce DiatopIt, the first corpus specifically focused on diatopic language variation in Italy for language varieties other than Standard Italian. DiatopIt comprises over 15K geolocated social media posts from Twitter over a period of two years, including regional Italian usage and content fully written in local language varieties or exhibiting code-switching with Standard Italian. We detail how we tackled key challenges in creating such a resource, including the absence of orthography standards for most local language varieties and the lack of reliable language identification tools. We assess the representativeness of DiatopIt across time and space, and show that the density of non-Standard Italian content across areas correlates with actual language use. Finally, we conduct computational experiments and find that modeling diatopic variation in highly multilingual areas such as Italy is a complex task even for recent language models.
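As a toy illustration of the density analysis mentioned above, the sketch below computes the per-region share of non-standard posts and correlates it with an external estimate of local language use; the region names, all counts, and the choice of Spearman correlation are invented for this example and are not taken from the paper.

    # Hypothetical illustration: per-region density of non-Standard Italian posts
    # vs. an external estimate of local language use (all numbers invented).
    from scipy.stats import spearmanr

    posts_per_region = {"Veneto": 2100, "Lazio": 1800, "Sicilia": 1500, "Toscana": 1200}
    nonstandard_posts = {"Veneto": 640, "Lazio": 210, "Sicilia": 470, "Toscana": 90}
    reported_dialect_use = {"Veneto": 0.62, "Lazio": 0.27, "Sicilia": 0.51, "Toscana": 0.16}

    regions = sorted(posts_per_region)
    density = [nonstandard_posts[r] / posts_per_region[r] for r in regions]
    usage = [reported_dialect_use[r] for r in regions]

    rho, pval = spearmanr(density, usage)
    print(f"Spearman rho = {rho:.2f} (p = {pval:.3f})")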

2022

DH-FBK at SemEval-2022 Task 4: Leveraging Annotators’ Disagreement and Multiple Data Views for Patronizing Language Detection
Alan Ramponi | Elisa Leonardelli
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

The subtle and typically unconscious use of patronizing and condescending language (PCL) in large-audience media outlets undesirably feeds stereotypes and strengthens power-knowledge relationships, perpetuating discrimination towards vulnerable communities. Due to its subjective and subtle nature, PCL detection is an open and challenging problem, both for computational methods and human annotators. In this paper we describe the systems submitted by the DH-FBK team to SemEval-2022 Task 4, which aims at detecting PCL towards vulnerable communities in English media texts. Motivated by the subjectivity of human interpretation, we propose to leverage annotators’ uncertainty and disagreement to better capture the shades of PCL in a multi-task, multi-view learning framework. Our approach achieves competitive results, largely outperforming baselines and ranking on the top-left side of the leaderboard on both PCL identification and classification. Notably, our approach does not rely on any external data or model ensemble, making it a viable and attractive solution for real-world use.
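The core idea of leveraging annotators' disagreement can be pictured with a minimal sketch: instead of training only on a single hard label, the fraction of annotators who marked a text as PCL is used as a soft target (in the actual system this signal is combined with other views in a multi-task framework). Everything below, including the vote counts, is an invented illustration rather than the submitted system.

    # Toy illustration (not the submitted system) of turning annotator
    # disagreement into a soft training target instead of a single hard label.
    import torch
    import torch.nn.functional as F

    votes = torch.tensor([        # per-example annotator votes (invented)
        [1, 1, 0],                # 2 of 3 annotators marked the text as PCL
        [0, 0, 0],
        [1, 1, 1],
    ])
    soft_targets = votes.float().mean(dim=1)          # fraction of "PCL" votes

    logits = torch.randn(3, requires_grad=True)       # stand-in model outputs
    loss = F.binary_cross_entropy_with_logits(logits, soft_targets)
    loss.backward()
    print(float(loss))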

Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection
Alan Ramponi | Sara Tonelli
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Avoiding reliance on dataset artifacts when predicting hate speech is a cornerstone of robust and fair hate speech detection. In this paper we critically analyze lexical biases in hate speech detection via a cross-platform study, disentangling various types of spurious and authentic artifacts and analyzing their impact on out-of-distribution fairness and robustness. We experiment with existing approaches and propose simple yet surprisingly effective data-centric baselines. Our results on English data across four platforms show that distinct spurious artifacts require different treatments to ultimately attain both robustness and fairness in hate speech detection. To encourage research in this direction, we release all baseline models and the code to compute artifacts, pointing out artifact analysis as a complementary and necessary addition to the data statements practice.
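One standard way to make lexical artifacts concrete is to measure the association between individual tokens and the hate label, for instance with pointwise mutual information (PMI). The snippet below is a generic sketch of that idea on a toy corpus; it is not the statistic or code released with the paper.

    # Generic sketch of spotting lexical artifacts: rank tokens by their PMI with
    # the "hate" label on a toy corpus (not the paper's released code or data).
    import math
    from collections import Counter

    corpus = [
        ("you people are awful", 1),
        ("awful weather today", 0),
        ("i love this community", 0),
        ("those people ruin everything", 1),
    ]

    token_counts, token_hate_counts = Counter(), Counter()
    n_docs = len(corpus)
    n_hate = sum(label for _, label in corpus)

    for text, label in corpus:
        for tok in set(text.split()):
            token_counts[tok] += 1
            if label == 1:
                token_hate_counts[tok] += 1

    def pmi(tok):
        p_joint = token_hate_counts[tok] / n_docs
        p_tok, p_hate = token_counts[tok] / n_docs, n_hate / n_docs
        return math.log2(p_joint / (p_tok * p_hate)) if p_joint > 0 else float("-inf")

    for tok, _ in token_counts.most_common():
        print(f"{tok:12s} PMI(tok, hate) = {pmi(tok):.2f}")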

2021

From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding
Rob van der Goot | Ibrahim Sharaf | Aizhan Imankulova | Ahmet Üstün | Marija Stepanović | Alan Ramponi | Siti Oryza Khairunnisa | Mamoru Komachi | Barbara Plank
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The lack of publicly available evaluation data for low-resource languages limits progress in Spoken Language Understanding (SLU). As key tasks like intent classification and slot filling require abundant training data, it is desirable to reuse existing data in high-resource languages to develop models for low-resource scenarios. We introduce xSID, a new benchmark for cross-lingual (x) Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect. To tackle the challenge, we propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer. We study two setups which differ by type and language coverage of the pre-trained embeddings. Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.
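A schematic picture of the joint setup is a shared encoder with an intent head and a slot head trained on English SLU data, plus an auxiliary masked language modeling head trained on raw non-English text. The sketch below uses toy dimensions and random data and is an assumption about the general pattern, not the xSID implementation.

    # Schematic sketch (toy dimensions, not the xSID implementation): a shared
    # encoder trained jointly on English intent/slot supervision plus an
    # auxiliary masked-LM objective on non-English raw text.
    import torch
    import torch.nn as nn

    VOCAB, N_INTENTS, N_SLOT_TAGS, DIM = 1000, 5, 9, 64

    class JointSLU(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, DIM)
            self.encoder = nn.GRU(DIM, DIM, batch_first=True)
            self.intent_head = nn.Linear(DIM, N_INTENTS)
            self.slot_head = nn.Linear(DIM, N_SLOT_TAGS)
            self.mlm_head = nn.Linear(DIM, VOCAB)   # auxiliary objective

        def forward(self, tokens):
            states, last = self.encoder(self.embed(tokens))
            return self.intent_head(last[-1]), self.slot_head(states), self.mlm_head(states)

    model = JointSLU()
    en_batch = torch.randint(0, VOCAB, (4, 12))    # stand-in for English SLU data
    aux_batch = torch.randint(0, VOCAB, (4, 12))   # stand-in for masked non-English text

    intent_logits, slot_logits, _ = model(en_batch)
    _, _, mlm_logits = model(aux_batch)

    ce = nn.CrossEntropyLoss()
    loss = ce(intent_logits, torch.randint(0, N_INTENTS, (4,))) \
        + ce(slot_logits.reshape(-1, N_SLOT_TAGS), torch.randint(0, N_SLOT_TAGS, (4 * 12,))) \
        + ce(mlm_logits.reshape(-1, VOCAB), torch.randint(0, VOCAB, (4 * 12,)))  # toy targets
    loss.backward()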

Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP
Rob van der Goot | Ahmet Üstün | Alan Ramponi | Ibrahim Sharaf | Barbara Plank
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

Transfer learning, particularly approaches that combine multi-task learning with pre-trained contextualized embeddings and fine-tuning, has advanced the field of Natural Language Processing tremendously in recent years. In this paper we present MaChAmp, a toolkit for easy fine-tuning of contextualized embeddings in multi-task settings. The benefits of MaChAmp are its flexible configuration options and its support for a variety of natural language processing tasks in a uniform toolkit, from text classification and sequence labeling to dependency parsing, masked language modeling, and text generation.
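MaChAmp itself is driven by configuration files rather than user code; the sketch below does not show its actual configuration schema or API, but illustrates the underlying pattern it automates, namely one pre-trained encoder shared by several task-specific heads (the encoder name and head sizes are arbitrary choices for this example).

    # Hand-rolled sketch of what MaChAmp automates behind a configuration file
    # (this is NOT MaChAmp's API): one shared pre-trained encoder, many heads.
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class MultiTaskModel(nn.Module):
        def __init__(self, encoder_name="bert-base-multilingual-cased"):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(encoder_name)
            hidden = self.encoder.config.hidden_size
            self.heads = nn.ModuleDict({
                "text_classification": nn.Linear(hidden, 2),
                "sequence_labeling": nn.Linear(hidden, 17),   # e.g. UPOS tags
            })

        def forward(self, task, **batch):
            states = self.encoder(**batch).last_hidden_state
            if task == "text_classification":
                return self.heads[task](states[:, 0])   # [CLS] representation
            return self.heads[task](states)              # per-token logits

    tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = MultiTaskModel()
    batch = tok(["MaChAmp supports many task types."], return_tensors="pt")
    print(model("sequence_labeling", **batch).shape)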

MultiLexNorm: A Shared Task on Multilingual Lexical Normalization
Rob van der Goot | Alan Ramponi | Arkaitz Zubiaga | Barbara Plank | Benjamin Muller | Iñaki San Vicente Roncal | Nikola Ljubešić | Özlem Çetinoğlu | Rahmad Mahendra | Talha Çolakoğlu | Timothy Baldwin | Tommaso Caselli | Wladimir Sidorenko
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)

Lexical normalization is the task of transforming an utterance into its standardized form. This task is beneficial for downstream analysis, as it provides a way to harmonize (often spontaneous) linguistic variation. Such variation is typical of social media, on which information is shared in a multitude of ways, including diverse languages and code-switching. Since the seminal work of Han and Baldwin (2011) a decade ago, lexical normalization has attracted attention in English and multiple other languages. However, a common benchmark for comparing systems across languages, with homogeneous data and evaluation setups, has been lacking. The MultiLexNorm shared task sets out to fill this gap. We provide the largest publicly available multilingual lexical normalization benchmark, including 13 language variants. We propose a homogenized evaluation setup with both intrinsic and extrinsic evaluation. As extrinsic evaluation, we use dependency parsing and part-of-speech tagging with adapted evaluation metrics (a-LAS, a-UAS, and a-POS) to account for alignment discrepancies. The shared task hosted at W-NUT 2021 attracted 9 participants and 18 submissions. The results show that neural normalization systems outperform the previous state-of-the-art system by a large margin. Downstream parsing and part-of-speech tagging performance is positively affected, but to varying degrees, with improvements of up to 1.72 a-LAS, 0.85 a-UAS, and 1.54 a-POS for the winning system.
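Intrinsic evaluation of lexical normalization is commonly reported relative to a leave-as-is baseline, for instance as an error reduction rate (ERR). The function below sketches that measure on an invented example and should be read as an assumption about the setup, not as the official shared task scorer.

    # Sketch of an intrinsic score for lexical normalization: error reduction
    # rate (ERR) relative to a leave-as-is baseline (an assumption about the
    # setup, not the official MultiLexNorm scorer).
    def err(original, gold, predicted):
        baseline_errors = sum(o != g for o, g in zip(original, gold))
        system_errors = sum(p != g for p, g in zip(predicted, gold))
        return (baseline_errors - system_errors) / baseline_errors

    original  = ["u",   "r",   "so", "gr8"]
    gold      = ["you", "are", "so", "great"]
    predicted = ["you", "r",   "so", "great"]
    print(f"ERR = {err(original, gold, predicted):.2f}")   # 2 of 3 errors fixed -> 0.67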

2020

Biomedical Event Extraction as Sequence Labeling
Alan Ramponi | Rob van der Goot | Rosario Lombardo | Barbara Plank
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We introduce Biomedical Event Extraction as Sequence Labeling (BeeSL), a joint end-to-end neural information extraction model. BeeSL recasts the task as sequence labeling, taking advantage of a multi-label aware encoding strategy and jointly modeling the intermediate tasks via multi-task learning. BeeSL is fast, accurate, end-to-end, and unlike current methods does not require any external knowledge base or preprocessing tools. BeeSL outperforms the current best system (Li et al., 2019) on the Genia 2011 benchmark by 1.57% absolute F1 score, reaching 60.22% F1 and establishing a new state of the art for the task. Importantly, we also provide first results on biomedical event extraction without gold entity information. Empirical results show that BeeSL’s speed and accuracy make it a viable approach for large-scale real-world scenarios.
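The recasting idea can be pictured as assigning each token labels that jointly encode its mention or trigger type and its role in an event, so that a standard sequence labeler can predict event structure token by token. The label scheme below is a simplified, invented illustration, not BeeSL's actual multi-label encoding.

    # Simplified illustration (not BeeSL's actual encoding): each token gets a
    # label for its mention/trigger type and one for its role in an event, so a
    # standard sequence labeler can predict event structure token by token.
    tokens       = ["IL-2", "gene", "expression", "depends", "on", "NF-kB"]
    mention_tags = ["B-Protein", "O", "B-Gene_expression", "O", "O", "B-Protein"]
    role_tags    = ["B-Theme", "O", "O", "O", "O", "O"]

    for tok, mention, role in zip(tokens, mention_tags, role_tags):
        print(f"{tok:12s} {mention:20s} {role}")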

Cross-Domain Evaluation of Edge Detection for Biomedical Event Extraction
Alan Ramponi | Barbara Plank | Rosario Lombardo
Proceedings of the Twelfth Language Resources and Evaluation Conference

Biomedical event extraction is a crucial task for automatically extracting information from the ever-growing body of biomedical literature. Despite methodological advances in recent years, most event extraction systems are still evaluated in-domain and on complete event structures only. This makes it hard to determine the performance of intermediate stages of the task, such as edge detection, across different corpora. Motivated by these limitations, we present the first cross-domain study of edge detection for biomedical event extraction. We analyze differences between five existing gold standard corpora, create a standardized benchmark corpus, and provide a strong baseline model for edge detection. Experiments show a large drop in performance when the baseline is applied to out-of-domain data, confirming the need for domain adaptation methods for the task. To encourage research efforts in this direction, we make both the data and the baseline available to the research community: https://www.cosbi.eu/cfx/9985.
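Edge detection amounts to deciding, for each candidate trigger-argument pair, whether a role edge holds between them. The snippet below frames this as plain pair classification over hand-crafted toy features with scikit-learn; it is an illustration of the task formulation, not the paper's baseline model.

    # Illustration (not the paper's baseline) of edge detection as classification
    # over (trigger, candidate argument) pairs with simple toy features.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    pairs = [  # invented training pairs: features of a trigger-argument candidate
        ({"trigger_type": "Gene_expression", "arg_type": "Protein", "dist": 2}, "Theme"),
        ({"trigger_type": "Gene_expression", "arg_type": "Protein", "dist": 9}, "None"),
        ({"trigger_type": "Regulation", "arg_type": "Gene_expression", "dist": 1}, "Theme"),
        ({"trigger_type": "Regulation", "arg_type": "Protein", "dist": 7}, "None"),
    ]

    vec = DictVectorizer()
    X = vec.fit_transform([features for features, _ in pairs])
    y = [label for _, label in pairs]

    clf = LogisticRegression(max_iter=1000).fit(X, y)
    test = {"trigger_type": "Gene_expression", "arg_type": "Protein", "dist": 3}
    print(clf.predict(vec.transform([test]))[0])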

Norm It! Lexical Normalization for Italian and Its Downstream Effects for Dependency Parsing
Rob van der Goot | Alan Ramponi | Tommaso Caselli | Michele Cafagna | Lorenzo De Mattei
Proceedings of the Twelfth Language Resources and Evaluation Conference

Lexical normalization is the task of translating non-standard social media data to a standard form. Previous work has shown that this is beneficial for many downstream tasks in multiple languages. However, for Italian, there is no benchmark available for lexical normalization, despite the presence of many benchmarks for other tasks involving social media data. In this paper, we discuss the creation of a lexical normalization dataset for Italian. After two rounds of annotation, a Cohen’s kappa score of 78.64 is obtained. During this process, we also analyze the inter-annotator agreement for this task, which is only rarely done on datasets for lexical normalization, and when it is reported, the analysis usually remains shallow. Furthermore, we utilize this dataset to train a lexical normalization model and show that it can be used to improve dependency parsing of social media data. All annotated data and the code to reproduce the results are available at: http://bitbucket.org/robvanderg/normit.
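As a quick reminder of what the reported agreement figure measures, the snippet below computes Cohen's kappa on a pair of invented normalize-vs-keep annotations (not on the paper's data).

    # How Cohen's kappa is computed, shown on invented annotations
    # (normalize vs. keep decisions), not on the paper's actual data.
    from collections import Counter

    ann_a = ["norm", "keep", "norm", "norm", "keep", "keep", "norm", "keep"]
    ann_b = ["norm", "keep", "keep", "norm", "keep", "norm", "norm", "keep"]

    n = len(ann_a)
    observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    counts_a, counts_b = Counter(ann_a), Counter(ann_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in set(ann_a) | set(ann_b))
    kappa = (observed - expected) / (1 - expected)
    print(f"kappa = {kappa:.2f}")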

Neural Unsupervised Domain Adaptation in NLP – A Survey
Alan Ramponi | Barbara Plank
Proceedings of the 28th International Conference on Computational Linguistics

Deep neural networks excel at learning from labeled data and achieve state-of-the-art results on a wide array of Natural Language Processing tasks. In contrast, learning from unlabeled data, especially under domain shift, remains a challenge. Motivated by the latest advances, in this survey we review neural unsupervised domain adaptation techniques which do not require labeled target domain data. This is a more challenging yet more widely applicable setup. We outline methods, from early traditional non-neural methods to pre-trained model transfer. We also revisit the notion of domain, and we uncover a bias in the types of Natural Language Processing tasks which have received most attention. Lastly, we outline future directions, particularly the broader need for out-of-distribution generalization in future NLP.
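As one concrete example of the family of methods such a survey covers, self-training adapts a source-trained model by pseudo-labeling unlabeled target data and re-training on the confident predictions. The sketch below is a generic illustration on toy data, not a method proposed in the survey.

    # Generic sketch of self-training for unsupervised domain adaptation:
    # a source-trained model pseudo-labels unlabeled target data and is
    # re-trained on its confident predictions (all data below is invented).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    source_texts = ["great movie", "awful movie", "loved the plot", "terrible acting"]
    source_labels = [1, 0, 1, 0]
    target_texts = ["the gadget works great", "battery life is terrible"]  # unlabeled

    vec = TfidfVectorizer().fit(source_texts + target_texts)
    clf = LogisticRegression().fit(vec.transform(source_texts), source_labels)

    probs = clf.predict_proba(vec.transform(target_texts))
    confident = probs.max(axis=1) > 0.6                # confidence threshold
    pseudo_labels = probs.argmax(axis=1)

    # Re-train on source data plus confidently pseudo-labeled target data.
    texts = source_texts + [t for t, keep in zip(target_texts, confident) if keep]
    labels = source_labels + [int(l) for l, keep in zip(pseudo_labels, confident) if keep]
    clf = LogisticRegression().fit(vec.transform(texts), labels)
    print(clf.predict(vec.transform(target_texts)))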