Lucia Donatelli


pdf bib
A Corpus of German Abstract Meaning Representation (DeAMR)
Christoph Otto | Jonas Groschwitz | Alexander Koller | Xiulin Yang | Lucia Donatelli
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We present the first comprehensive set of guidelines for German Abstract Meaning Representation (Deutsche AMR, DeAMR) along with an annotated corpus of 400 DeAMR. Taking English AMR (EnAMR) as our starting point, we propose significant adaptations to faithfully represent the structure and semantics of German, focusing particularly on verb frames, compound words, and modality. We validate our annotation through inter-annotator agreement and further evaluate our corpus with a comparison of structural divergences between EnAMR and DeAMR on parallel sentences, replicating previous work that finds both cases of cross-lingual structural alignment and cases of meaningful linguistic divergence. Finally, we fine-tune state-of-the-art multi-lingual and cross-lingual AMR parsers on our corpus and find that, while our small corpus is insufficient to produce quality output, there is a need to continue develop and evaluate against gold non-English AMR data.

pdf bib
Encoding Gesture in Multimodal Dialogue: Creating a Corpus of Multimodal AMR
Kenneth Lai | Richard Brutti | Lucia Donatelli | James Pustejovsky
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Abstract Meaning Representation (AMR) is a general-purpose meaning representation that has become popular for its clear structure, ease of annotation and available corpora, and overall expressiveness. While AMR was designed to represent sentence meaning in English text, recent research has explored its adaptation to broader domains, including documents, dialogues, spatial information, cross-lingual tasks, and gesture. In this paper, we present an annotated corpus of multimodal (speech and gesture) AMR in a task-based setting. Our corpus is multilayered, containing temporal alignments to both the speech signal and to descriptions of gesture morphology. We also capture coreference relationships across modalities, enabling fine-grained analysis of how the semantics of gesture and natural language interact. We discuss challenges that arise when identifying cross-modal coreference and anaphora, as well as in creating and evaluating multimodal corpora in general. Although we find AMR’s abstraction away from surface form (in both language and gesture) occasionally too coarse-grained to capture certain cross-modal interactions, we believe its flexibility allows for future work to fill in these gaps. Our corpus and annotation guidelines are available at

pdf bib
SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus
Stephanie M. Lukin | Claire Bonial | Matthew Marge | Taylor A. Hudson | Cory J. Hayes | Kimberly Pollard | Anthony Baker | Ashley N. Foots | Ron Artstein | Felix Gervits | Mitchell Abrams | Cassidy Henry | Lucia Donatelli | Anton Leuski | Susan G. Hill | David Traum | Clare Voss
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely-located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue. The dialogues are aligned with the multi-modal data streams available during the experiments: 5,785 images and 30 maps. The corpus has been annotated with Abstract Meaning Representation and Dialogue-AMR to identify the speaker’s intent and meaning within an utterance, and with Transactional Units and Relations to track relationships between utterances to reveal patterns of the Dialogue Structure. We describe how the corpus and its annotations have been used to develop autonomous human-robot systems and enable research in open questions of how humans speak to robots. We release this corpus to accelerate progress in autonomous, situated, human-robot dialogue, especially in the context of navigation tasks where details about the environment need to be discovered.

pdf bib
SPOTTER: A Framework for Investigating Convention Formation in a Visually Grounded Human-Robot Reference Task
Jaap Kruijt | Peggy van Minkelen | Lucia Donatelli | Piek T.J.M. Vossen | Elly Konijn | Thomas Baier
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Linguistic conventions that arise in dialogue reflect common ground and can increase communicative efficiency. Social robots that can understand these conventions and the process by which they arise have the potential to become efficient communication partners. Nevertheless, it is unclear how robots can engage in convention formation when presented with both familiar and new information. We introduce an adaptable game platform, SPOTTER, to study the dynamics of convention formation for visually grounded referring expressions in both human-human and human-robot interaction. Specifically, we seek to elicit convention forming for members of an inner circle of well-known individuals in the common ground, as opposed to individuals from an outer circle, who are unfamiliar. We release an initial corpus of 5000 utterances from two exploratory pilot experiments in Dutch. Different from previous work focussing on human-human interaction, we find that referring expressions for both familiar and unfamiliar individuals maintain their length throughout human-robot interaction. Stable conventions are formed, although these conventions can be impacted by distracting outer circle individuals. With our distinction between familiar and unfamiliar, we create a contrastive operationalization of common ground, which aids research into convention formation.

pdf bib
Cultural Adaptation of Recipes
Yong Cao | Yova Kementchedjhieva | Ruixiang Cui | Antonia Karamolegkou | Li Zhou | Megan Dare | Lucia Donatelli | Daniel Hershcovich
Transactions of the Association for Computational Linguistics, Volume 12

Building upon the considerable advances in Large Language Models (LLMs), we are now equipped to address more sophisticated tasks demanding a nuanced understanding of cross-cultural contexts. A key example is recipe adaptation, which goes beyond simple translation to include a grasp of ingredients, culinary techniques, and dietary preferences specific to a given culture. We introduce a new task involving the translation and cultural adaptation of recipes between Chinese- and English-speaking cuisines. To support this investigation, we present CulturalRecipes, a unique dataset composed of automatically paired recipes written in Mandarin Chinese and English. This dataset is further enriched with a human-written and curated test set. In this intricate task of cross-cultural recipe adaptation, we evaluate the performance of various methods, including GPT-4 and other LLMs, traditional machine translation, and information retrieval techniques. Our comprehensive analysis includes both automatic and human evaluation metrics. While GPT-4 exhibits impressive abilities in adapting Chinese recipes into English, it still lags behind human expertise when translating English recipes into Chinese. This underscores the multifaceted nature of cultural adaptations. We anticipate that these insights will significantly contribute to future research on culturally aware language models and their practical application in culturally diverse contexts.


pdf bib
From Sentence to Action: Splitting AMR Graphs for Recipe Instructions
Katharina Stein | Lucia Donatelli | Alexander Koller
Proceedings of the Fourth International Workshop on Designing Meaning Representations

Accurately interpreting the relationships between actions in a recipe text is essential to successful recipe completion. We explore using Abstract Meaning Representation (AMR) to represent recipe instructions, abstracting away from syntax and sentence structure that may order recipe actions in arbitrary ways. We present an algorithm to split sentence-level AMRs into action-level AMRs for individual cooking steps. Our approach provides an automatic way to derive fine-grained AMR representations of actions in cooking recipes and can be a useful tool for downstream, instructional tasks.

pdf bib
SLOG: A Structural Generalization Benchmark for Semantic Parsing
Bingzhi Li | Lucia Donatelli | Alexander Koller | Tal Linzen | Yuekun Yao | Najoung Kim
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions. Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training; structural generalization tasks, where a model needs to interpret syntactic structures that are themselves unfamiliar from training, are often underrepresented, resulting in overly optimistic perceptions of how well models can generalize. We introduce SLOG, a semantic parsing dataset that extends COGS (Kim and Linzen, 2020) with 17 structural generalization cases. In our experiments, the generalization accuracy of Transformer models, including pretrained ones, only reaches 40.6%, while a structure-aware parser only achieves 70.8%. These results are far from the near-perfect accuracy existing models achieve on COGS, demonstrating the role of SLOG in foregrounding the large discrepancy between models’ lexical and structural generalization capacities.

pdf bib
AMR Parsing is Far from Solved: GrAPES, the Granular AMR Parsing Evaluation Suite
Jonas Groschwitz | Shay Cohen | Lucia Donatelli | Meaghan Fowlie
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We present the Granular AMR Parsing Evaluation Suite (GrAPES), a challenge set for Abstract Meaning Representation (AMR) parsing with accompanying evaluation metrics. AMR parsers now obtain high scores on the standard AMR evaluation metric Smatch, close to or even above reported inter-annotator agreement. But that does not mean that AMR parsing is solved; in fact, human evaluation in previous work indicates that current parsers still quite frequently make errors on node labels or graph structure that substantially distort sentence meaning. Here, we provide an evaluation suite that tests AMR parsers on a range of phenomena of practical, technical, and linguistic interest. Our 36 categories range from seen and unseen labels, to structural generalization, to coreference. GrAPES reveals in depth the abilities and shortcomings of current AMR parsers.


pdf bib
Abstract Meaning Representation for Gesture
Richard Brutti | Lucia Donatelli | Kenneth Lai | James Pustejovsky
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper presents Gesture AMR, an extension to Abstract Meaning Representation (AMR), that captures the meaning of gesture. In developing Gesture AMR, we consider how gesture form and meaning relate; how gesture packages meaning both independently and in interaction with speech; and how the meaning of gesture is temporally and contextually determined. Our case study for developing Gesture AMR is a focused human-human shared task to build block structures. We develop an initial taxonomy of gesture act relations that adheres to AMR’s existing focus on predicate-argument structure while integrating meaningful elements unique to gesture. Pilot annotation shows Gesture AMR to be more challenging than standard AMR, and illustrates the need for more work on representation of dialogue and multimodal meaning. We discuss challenges of adapting an existing meaning representation to non-speech-based modalities and outline several avenues for expanding Gesture AMR.

pdf bib
Proceedings of the 29th International Conference on Computational Linguistics
Nicoletta Calzolari | Chu-Ren Huang | Hansaem Kim | James Pustejovsky | Leo Wanner | Key-Sun Choi | Pum-Mo Ryu | Hsin-Hsi Chen | Lucia Donatelli | Heng Ji | Sadao Kurohashi | Patrizia Paggio | Nianwen Xue | Seokhwan Kim | Younggyun Hahm | Zhong He | Tony Kyungil Lee | Enrico Santus | Francis Bond | Seung-Hoon Na
Proceedings of the 29th International Conference on Computational Linguistics

pdf bib
Compositional generalization with a broad-coverage semantic parser
Pia Weißenhorn | Lucia Donatelli | Alexander Koller
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics

We show how the AM parser, a compositional semantic parser (Groschwitz et al., 2018) can solve compositional generalization on the COGS dataset. It is the first semantic parser that achieves high accuracy on both naturally occurring language and the synthetic COGS dataset. We discuss implications for corpus and model design for learning human-like generalization. Our results suggest that compositional generalization can be best achieved by building compositionality into semantic parsers.

pdf bib
Spanish Abstract Meaning Representation: Annotation of a General Corpus
Shira Wein | Lucia Donatelli | Ethan Ricker | Calvin Engstrom | Alex Nelson | Leonie Harter | Nathan Schneider
Northern European Journal of Language Technology, Volume 8

Abstract Meaning Representation (AMR), originally designed for English, has been adapted to a number of languages to facilitate cross-lingual semantic representation and analysis. We build on previous work and present the first sizable, general annotation project for Spanish AMR. We release a detailed set of annotation guidelines and a corpus of 486 gold-annotated sentences spanning multiple genres from an existing, cross-lingual AMR corpus. Our work constitutes the second largest non-English gold AMR corpus to date. Fine-tuning an AMR to-Spanish generation model with our annotations results in a BERTScore improvement of 8.8%, demonstrating initial utility of our work.

pdf bib
On the Cusp of Comprehensibility: Can Language Models Distinguish Between Metaphors and Nonsense?
Bernadeta Griciūtė | Marc Tanti | Lucia Donatelli
Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)

Utterly creative texts can sometimes be difficult to understand, balancing on the edge of comprehensibility. However, good language skills and common sense allow advanced language users both to interpret creative texts and to reject some linguistic input as nonsense. The goal of this paper is to evaluate whether the current language models are also able to make the distinction between a creative language use and nonsense. To test this, we have computed mean rank and pseudo-log-likelihood score (PLL) of metaphorical and nonsensical sentences, and fine-tuned several pretrained models (BERT, RoBERTa) for binary classification between the two categories. There was a significant difference in the mean ranks and PPL scores of the categories, and the classifier reached around 85.5% accuracy. The results raise further questions on what could have let to such satisfactory performance.


pdf bib
Aligning Actions Across Recipe Graphs
Lucia Donatelli | Theresa Schmidt | Debanjali Biswas | Arne Köhn | Fangzhou Zhai | Alexander Koller
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recipe texts are an idiosyncratic form of instructional language that pose unique challenges for automatic understanding. One challenge is that a cooking step in one recipe can be explained in another recipe in different words, at a different level of abstraction, or not at all. Previous work has annotated correspondences between recipe instructions at the sentence level, often glossing over important correspondences between cooking steps across recipes. We present a novel and fully-parsed English recipe corpus, ARA (Aligned Recipe Actions), which annotates correspondences between individual actions across similar recipes with the goal of capturing information implicit for accurate recipe understanding. We represent this information in the form of recipe graphs, and we train a neural model for predicting correspondences on ARA. We find that substantial gains in accuracy can be obtained by taking fine-grained structural information about the recipes into account.

pdf bib
Representing Implicit Positive Meaning of Negated Statements in AMR
Katharina Stein | Lucia Donatelli
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

Abstract Meaning Representation (AMR) has become popular for representing the meaning of natural language in graph structures. However, AMR does not represent scope information, posing a problem for its overall expressivity and specifically for drawing inferences from negated statements. This is the case with so-called “positive interpretations” of negated statements, in which implicit positive meaning is identified by inferring the opposite of the negation’s focus. In this work, we investigate how potential positive interpretations (PPIs) can be represented in AMR. We propose a logically motivated AMR structure for PPIs that makes the focus of negation explicit and sketch an initial proposal for a systematic methodology to generate this more expressive structure.

pdf bib
Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)
Lucia Donatelli | Nikhil Krishnaswamy | Kenneth Lai | James Pustejovsky
Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)


pdf bib
A Continuation Semantics for Abstract Meaning Representation
Kenneth Lai | Lucia Donatelli | James Pustejovsky
Proceedings of the Second International Workshop on Designing Meaning Representations

Abstract Meaning Representation (AMR) is a simple, expressive semantic framework whose emphasis on predicate-argument structure is effective for many tasks. Nevertheless, AMR lacks a systematic treatment of projection phenomena, making its translation into logical form problematic. We present a translation function from AMR to first order logic using continuation semantics, which allows us to capture the semantic context of an expression in the form of an argument. This is a natural extension of AMR’s original design principles, allowing us to easily model basic projection phenomena such as quantification and negation as well as complex phenomena such as bound variables and donkey anaphora.

pdf bib
Dialogue-AMR: Abstract Meaning Representation for Dialogue
Claire Bonial | Lucia Donatelli | Mitchell Abrams | Stephanie M. Lukin | Stephen Tratz | Matthew Marge | Ron Artstein | David Traum | Clare Voss
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper describes a schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems. AMR offers a valuable level of abstraction of the propositional content of an utterance; however, it does not capture the illocutionary force or speaker’s intended contribution in the broader dialogue context (e.g., make a request or ask a question), nor does it capture tense or aspect. We explore dialogue in the domain of human-robot interaction, where a conversational robot is engaged in search and navigation tasks with a human partner. To address the limitations of standard AMR, we develop an inventory of speech acts suitable for our domain, and present “Dialogue-AMR”, an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect. To showcase the coverage of the schema, we use both manual and automatic methods to construct the “DialAMR” corpus—a corpus of human-robot dialogue annotated with standard AMR and our enriched Dialogue-AMR schema. Our automated methods can be used to incorporate AMR into a larger NLU pipeline supporting human-robot dialogue.

pdf bib
Normalizing Compositional Structures Across Graphbanks
Lucia Donatelli | Jonas Groschwitz | Matthias Lindemann | Alexander Koller | Pia Weißenhorn
Proceedings of the 28th International Conference on Computational Linguistics

The emergence of a variety of graph-based meaning representations (MRs) has sparked an important conversation about how to adequately represent semantic structure. MRs exhibit structural differences that reflect different theoretical and design considerations, presenting challenges to uniform linguistic analysis and cross-framework semantic parsing. Here, we ask the question of which design differences between MRs are meaningful and semantically-rooted, and which are superficial. We present a methodology for normalizing discrepancies between MRs at the compositional level (Lindemann et al., 2019), finding that we can normalize the majority of divergent phenomena using linguistically-grounded rules. Our work significantly increases the match in compositional structure between MRs and improves multi-task learning (MTL) in a low-resource setting, serving as a proof of concept for future broad-scale cross-MR normalization.

pdf bib
A Two-Level Interpretation of Modality in Human-Robot Dialogue
Lucia Donatelli | Kenneth Lai | James Pustejovsky
Proceedings of the 28th International Conference on Computational Linguistics

We analyze the use and interpretation of modal expressions in a corpus of situated human-robot dialogue and ask how to effectively represent these expressions for automatic learning. We present a two-level annotation scheme for modality that captures both content and intent, integrating a logic-based, semantic representation and a task-oriented, pragmatic representation that maps to our robot’s capabilities. Data from our annotation task reveals that the interpretation of modal expressions in human-robot dialogue is quite diverse, yet highly constrained by the physical environment and asymmetrical speaker/addressee relationship. We sketch a formal model of human-robot common ground in which modality can be grounded and dynamically interpreted.

pdf bib
Graph-to-Graph Meaning Representation Transformations for Human-Robot Dialogue
Mitchell Abrams | Claire Bonial | Lucia Donatelli
Proceedings of the Society for Computation in Linguistics 2020


pdf bib
Abstract Meaning Representation for Human-Robot Dialogue
Claire N. Bonial | Lucia Donatelli | Jessica Ervin | Clare R. Voss
Proceedings of the Society for Computation in Linguistics (SCiL) 2019

pdf bib
Augmenting Abstract Meaning Representation for Human-Robot Dialogue
Claire Bonial | Lucia Donatelli | Stephanie M. Lukin | Stephen Tratz | Ron Artstein | David Traum | Clare Voss
Proceedings of the First International Workshop on Designing Meaning Representations

We detail refinements made to Abstract Meaning Representation (AMR) that make the representation more suitable for supporting a situated dialogue system, where a human remotely controls a robot for purposes of search and rescue and reconnaissance. We propose 36 augmented AMRs that capture speech acts, tense and aspect, and spatial information. This linguistic information is vital for representing important distinctions, for example whether the robot has moved, is moving, or will move. We evaluate two existing AMR parsers for their performance on dialogue data. We also outline a model for graph-to-graph conversion, in which output from AMR parsers is converted into our refined AMRs. The design scheme presented here, though task-specific, is extendable for broad coverage of speech acts using AMR in future task-independent work.

pdf bib
Saarland at MRP 2019: Compositional parsing across all graphbanks
Lucia Donatelli | Meaghan Fowlie | Jonas Groschwitz | Alexander Koller | Matthias Lindemann | Mario Mina | Pia Weißenhorn
Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning

We describe the Saarland University submission to the shared task on Cross-Framework Meaning Representation Parsing (MRP) at the 2019 Conference on Computational Natural Language Learning (CoNLL).


pdf bib
Towards a Computational Lexicon for Moroccan Darija: Words, Idioms, and Constructions
Jamal Laoudi | Claire Bonial | Lucia Donatelli | Stephen Tratz | Clare Voss
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

In this paper, we explore the challenges of building a computational lexicon for Moroccan Darija (MD), an Arabic dialect spoken by over 32 million people worldwide but which only recently has begun appearing frequently in written form in social media. We raise the question of what belongs in such a lexicon and start by describing our work building traditional word-level lexicon entries with their English translations. We then discuss challenges in translating idiomatic MD text that led to creating multi-word expression lexicon entries whose meanings could not be fully derived from the individual words. Finally, we provide a preliminary exploration of constructions to be considered for inclusion in an MD constructicon by translating examples of English constructions and examining their MD counterparts.

pdf bib
Annotation of Tense and Aspect Semantics for Sentential AMR
Lucia Donatelli | Michael Regan | William Croft | Nathan Schneider
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

Although English grammar encodes a number of semantic contrasts with tense and aspect marking, these semantics are currently ignored by Abstract Meaning Representation (AMR) annotations. This paper extends sentence-level AMR to include a coarse-grained treatment of tense and aspect semantics. The proposed framework augments the representation of finite predications to include a four-way temporal distinction (event time before, up to, at, or after speech time) and several aspectual distinctions (including static vs. dynamic, habitual vs. episodic, and telic vs. atelic). This will enable AMR to be used for NLP tasks and applications that require sophisticated reasoning about time and event structure.