Teresa Paccosi

2025

Detecting Changing Culinary Trends Through Historical Recipes
Gauri Bhagwat | Marieke van Erp | Teresa Paccosi | Rik Hoekstra
Proceedings of the 5th Conference on Language, Data and Knowledge

Culinary trends evolve in response to social, economic, and cultural influences, reflecting broader historical transformations. We present an exploration into Dutch culinary trends from 1910 to 1995 by analysing recipes from housekeeping school cookbooks and newspaper recipe collections. Using computational techniques, we extract and examine ingredient frequency, recipe complexity, and shifts in recipe categories to identify trends in Dutch cuisine from a quantitative point of view. Additionally, we experiment with Large Language Models (LLMs) to structure and extract recipes’ features, demonstrating their potential for historical recipe parsing.

2024

A New Annotation Scheme for the Semantics of Taste
Teresa Paccosi | Sara Tonelli
Proceedings of the 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024

This paper introduces a new annotation scheme for the semantics of gustatory language in English, which builds upon a previous framework for olfactory language based on frame semantics. The purpose of this annotation framework is to be used for annotating comparable resources for the study of sensory language and to create training datasets for supervised systems aimed at extracting sensory information. Furthermore, our approach incorporates words from specific historical periods, thereby enhancing the framework’s utility for studying language from a diachronic perspective.

Benchmarking the Semantics of Taste: Towards the Automatic Extraction of Gustatory Language
Teresa Paccosi | Sara Tonelli
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

In this paper, we present a benchmark containing texts manually annotated with gustatory semantic information. We employ a FrameNet-like approach previously tested to address olfactory language, which we adapt to capture gustatory events. We then propose an exploration of the data in the benchmark to show the possible insights brought by this type of approach, addressing the investigation of emotional valence in text genres. Finally, we present a supervised system trained with the taste benchmark for the extraction of gustatory information from historical and contemporary texts.

2023

Scent and Sensibility: Perception Shifts in the Olfactory Domain
Teresa Paccosi | Stefano Menini | Elisa Leonardelli | Ilaria Barzon | Sara Tonelli
Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change

In this work, we investigate olfactory perception shifts, analysing how the description of the smells emitted by specific sources has changed over time. We first create a benchmark of selected smell sources, relying upon existing historical studies related to olfaction. We also collect an English text corpus by retrieving large collections of documents from freely available resources, spanning from 1500 to 2000 and covering different domains. We label this corpus using a system for olfactory information extraction inspired by frame semantics, where the semantic roles around the smell sources in the benchmark are marked. We then analyse how the roles describing Qualities of smell sources change over time and how they can contribute to characterising perception shifts, also in comparison with more standard statistical approaches.

Scent Mining: Extracting Olfactory Events, Smell Sources and Qualities
Stefano Menini | Teresa Paccosi | Serra Sinem Tekiroğlu | Sara Tonelli
Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Olfaction is a rather understudied sense compared to the other senses. In NLP, however, there have been recent attempts to develop taxonomies and benchmarks specifically designed to capture smell-related information. In this work, we further extend this research line by presenting a supervised system for olfactory information extraction in English. We cast this problem as a token classification task and build a system that identifies smell words, smell sources and qualities. The classifier is then applied to a set of English historical corpora, covering different domains and written in a time period between the 15th and the 20th century. A qualitative analysis of the extracted data shows that they can be used to infer interesting information about smelly items such as tea and tobacco from a diachronic perspective, supporting historical investigation with corpus-based evidence.

2022

We present a benchmark in six European languages containing manually annotated information about olfactory situations and events following a FrameNet-like approach. The document selection covers ten domains of interest to cultural historians in the olfactory domain and includes texts published between 1620 and 1920, allowing a diachronic analysis of smell descriptions. With this work, we aim to foster the development of olfactory information extraction approaches as well as the analysis of changes in smell descriptions over time.

KIND: an Italian Multi-Domain Dataset for Named Entity Recognition
Teresa Paccosi | Alessio Palmero Aprosio
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper we present KIND, an Italian dataset for named entity recognition. It contains more than one million tokens with annotation covering three classes: person, location, and organization. Most of the dataset (around 600K tokens) contains manual gold annotations in three different domains (news, literature, and political discourses); the remainder is semi-automatically annotated. The multi-domain feature is the main strength of the present work, offering a resource which covers different styles and language uses, as well as the largest Italian NER dataset with manual gold annotations. It represents an important resource for the training of NER systems in Italian. Texts and annotations are freely downloadable from the GitHub repository.

Building a Multilingual Taxonomy of Olfactory Terms with Timestamps
Stefano Menini | Teresa Paccosi | Serra Sinem Tekiroğlu | Sara Tonelli
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Olfactory references play a crucial role in our memory and, more generally, in our experiences, since researchers have shown that smell is the sense that is most directly connected with emotions. Nevertheless, only a few works in NLP have tried to capture this sensory dimension from a computational perspective. One of the main challenges is the lack of a systematic and consistent taxonomy of olfactory information, in which concepts are also organised from a multilingual perspective. WordNet represents a valuable starting point in this direction, which can be semi-automatically extended taking advantage of Google n-grams and of existing language models. In this work we describe the process that has led to the semi-automatic development of a taxonomy for olfactory information in four languages (English, French, German and Italian), detailing the different steps and the intermediate evaluations. Along with being multilingual, the taxonomy also includes temporal marks for olfactory terms, thus making it a valuable resource for historical content analysis. The resource has been released and is freely available.