Ira Assent

2025

Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics
Amalie Brogaard Pauli | Isabelle Augenstein | Ira Assent
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models (LLMs) make it easy to rewrite a text in any style – e.g. to make it more polite, persuasive, or more positive – but evaluation thereof is not straightforward. A challenge lies in measuring content preservation: that content not attributable to style change is retained. This paper presents a large meta-evaluation of metrics for evaluating style and attribute transfer, focusing on content preservation. We find that meta-evaluation studies on existing datasets lead to misleading conclusions about the suitability of metrics for content preservation. Widely used metrics show a high correlation with human judgments despite being deemed unsuitable for the task – because they do not abstract from style changes when evaluating content preservation. We show that the overly high correlations with human judgment stem from the nature of the test data. To address this issue, we introduce a new, challenging test set specifically designed for evaluating content preservation metrics for style transfer. We construct the data by creating high variation in the content preservation. Using this dataset, we demonstrate that suitable metrics for content preservation for style transfer indeed are style-aware. To support efficient evaluation, we propose a new style-aware method that utilises small language models, obtaining a higher alignment with human judgements than prompting a model of a similar size as an autorater.

Measuring and Benchmarking Large Language Models’ Capabilities to Generate Persuasive Language
Amalie Brogaard Pauli | Isabelle Augenstein | Ira Assent
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

We are exposed to much information trying to influence us, such as teaser messages, debates, politically framed news, and propaganda — all of which use persuasive language. With the recent interest in Large Language Models (LLMs), we study the ability of LLMs to produce persuasive text. As opposed to prior work which focuses on particular domains or types of persuasion, we conduct a general study across various domains to measure and benchmark to what degree LLMs produce persuasive language - both when explicitly instructed to rewrite text to be more or less persuasive and when only instructed to paraphrase. We construct the new dataset Persuasive-Pairs of pairs of a short text and its rewrite by an LLM to amplify or diminish persuasive language. We multi-annotate the pairs on a relative scale for persuasive language: a valuable resource in itself, and for training a regression model to score and benchmark persuasive language, including for new LLMs across domains. In our analysis, we find that different ‘personas’ in LLaMA3’s system prompt change persuasive language substantially, even when only instructed to paraphrase.

2023

Anchoring Fine-tuning of Sentence Transformer with Semantic Label Information for Efficient Truly Few-shot Classification
Amalie Pauli | Leon Derczynski | Ira Assent
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Few-shot classification is a powerful technique, but training requires substantial computing power and data. We propose an efficient method with small model sizes and less training data with only 2-8 training instances per class. Our proposed method, AncSetFit, targets low data scenarios by anchoring the task and label information through sentence embeddings in fine-tuning a Sentence Transformer model. It uses contrastive learning and a triplet loss to enforce training instances of a class to be closest to its own textual semantic label information in the embedding space - thereby learning to embed instances of different classes more distinctly. AncSetFit obtains strong performance in data-sparse scenarios compared to existing methods across SST-5, Emotion detection, and AG News data, even with just two examples per class.

Søren Kierkegaard at SemEval-2023 Task 4: Label-aware text classification using Natural Language Inference
Ignacio Talavera Cepeda | Amalie Pauli | Ira Assent
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

In this paper, we describe our approach to Task 4 in SemEval 2023. Our pipeline tries to solve the problem of multi-label text classification of human values in English-written arguments. We propose a label-aware system where we reframe the multi-label task into a binary task resembling an NLI task. We propose to include the semantic description of the human values by comparing each description to each argument and ask whether there is entailment or not.

TeamAmpa at SemEval-2023 Task 3: Exploring Multilabel and Multilingual RoBERTa Models for Persuasion and Framing Detection
Amalie Pauli | Rafael Sarabia | Leon Derczynski | Ira Assent
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our submission to the SemEval 2023 Task 3 on two subtasks: detecting persuasion techniques and framing. Both subtasks are multi-label classification problems. We present a set of experiments, exploring how to get robust performance across languages using pre-trained RoBERTa models. We test different oversampling strategies, a strategy of adding textual features from predictions obtained with related models, and present both inconclusive and negative results. We achieve a robust ranking across languages and subtasks with our best ranking being nr. 1 for Subtask 3 on Spanish.

2022

Modelling Persuasion through Misuse of Rhetorical Appeals
Amalie Pauli | Leon Derczynski | Ira Assent
Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)

It is important to understand how people use words to persuade each other. This helps understand debate, and detect persuasive narratives in regard to e.g. misinformation. While computational modelling of some aspects of persuasion has received some attention, a way to unify and describe the overall phenomenon of when persuasion becomes undesired and problematic is missing. In this paper, we attempt to address this by proposing a taxonomy of computational persuasion. Drawing upon existing research and resources, this paper shows how to re-frame and re-organise current work into a coherent framework targeting the misuse of rhetorical appeals. As a study to validate these re-framings, we then train and evaluate models of persuasion adapted to our taxonomy. Our results show an application of our taxonomy, and we are able to detect misuse of rhetorical appeals, finding that these are more often used in misinformative contexts than in true ones.

2021

A reproduction of Apple’s bi-directional LSTM models for language identification in short strings
Mads Toftrup | Søren Asger Sørensen | Manuel R. Ciosici | Ira Assent
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop

Language Identification is the task of identifying a document’s language. For applications like automatic spell checker selection, language identification must use very short strings such as text message fragments. In this work, we reproduce a language identification architecture that Apple briefly sketched in a blog post. We confirm the bi-LSTM model’s performance and find that it outperforms current open-source language identifiers. We further find that its language identification mistakes are due to confusion between related languages.

2020

A Real-World Data Resource of Complex Sensitive Sentences Based on Documents from the Monsanto Trial
Jan Neerbek | Morten Eskildsen | Peter Dolog | Ira Assent
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this work we present a corpus for the evaluation of sensitive information detection approaches that addresses the need for real world sensitive information for empirical studies. Our sentence corpus contains different notions of complex sensitive information that correspond to different aspects of concern in a current trial of the Monsanto company. This paper describes the annotation process, where we both employ human annotators and furthermore create automatically inferred labels regarding technical, legal and informal communication within and with employees of Monsanto, drawing on a classification of documents by lawyers involved in the Monsanto court case. We release a corpus of high quality sentences and parse trees with these two types of labels on sentence level. We characterize the sensitive information via several representative sensitive information detection models, in particular both keyword-based (n-gram) approaches and recent deep learning models, namely, recurrent neural networks (LSTM) and recursive neural networks (RecNN). Data and code are made publicly available.

One of these words is not like the other: a reproduction of outlier identification using non-contextual word representations
Jesper Brink Andersen | Mikkel Bak Bertelsen | Mikkel Hørby Schou | Manuel R. Ciosici | Ira Assent
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems

Word embeddings are an active topic in the NLP research community. State-of-the-art neural models achieve high performance on downstream tasks, albeit at the cost of computationally expensive training. Cost aware solutions require cheaper models that still achieve good performance. We present several reproduction studies of intrinsic evaluation tasks that evaluate non-contextual word representations in multiple languages. Furthermore, we present 50-8-8, a new data set for the outlier identification task, which avoids limitations of the original data set, such as ambiguous words, infrequent words, and multi-word tokens, while increasing the number of test cases. The data set is expanded to contain semantic and syntactic tests and is multilingual (English, German, and Italian). We provide an in-depth analysis of word embedding models with a range of hyper-parameters. Our analysis shows the suitability of different models and hyper-parameters for different tasks and the greater difficulty of representing German and Italian languages.

Accelerated High-Quality Mutual-Information Based Word Clustering
Manuel R. Ciosici | Ira Assent | Leon Derczynski
Proceedings of the Twelfth Language Resources and Evaluation Conference

Word clustering groups words that exhibit similar properties. One popular method for this is Brown clustering, which uses short-range distributional information to construct clusters. Specifically, this is a hard hierarchical clustering with a fixed-width beam that employs bi-grams and greedily minimizes global mutual information loss. The result is word clusters that tend to outperform or complement other word representations, especially when constrained by small datasets. However, Brown clustering has high computational complexity and does not lend itself to parallel computation. This, together with the lack of efficient implementations, limits its applicability in NLP. We present efficient implementations of Brown clustering and the alternative Exchange clustering as well as a number of methods to accelerate the computation of both hierarchical and flat clusters. We show empirically that clusters obtained with the accelerated method match the performance of clusters computed using the original methods.

2019

Abbreviation Explorer - an interactive system for pre-evaluation of Unsupervised Abbreviation Disambiguation
Manuel R. Ciosici | Ira Assent
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

We present Abbreviation Explorer, a system that supports interactive exploration of abbreviations that are challenging for Unsupervised Abbreviation Disambiguation (UAD). Abbreviation Explorer helps to identify long-forms that are easily confused, and to pinpoint likely causes such as limitations of normalization, language switching, or inconsistent typing. It can also support determining which long-forms would benefit from additional input text for unsupervised abbreviation disambiguation. The system provides options for creating corrective rules that merge redundant long-forms with identical meaning. The identified rules can be easily applied to the already existing vector spaces used by UAD to improve disambiguation performance, while also avoiding the cost of retraining.

Quantifying the morphosyntactic content of Brown Clusters
Manuel R. Ciosici | Leon Derczynski | Ira Assent
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Brown and Exchange word clusters have long been successfully used as word representations in Natural Language Processing (NLP) systems. Their success has been attributed to their seeming ability to represent both semantic and syntactic information. Using corpora representing several language families, we test the hypothesis that Brown and Exchange word clusters are highly effective at encoding morphosyntactic information. Our experiments show that word clusters are highly capable of distinguishing Parts of Speech. We show that increases in Average Mutual Information, the clustering algorithms’ optimization goal, are highly correlated with improvements in encoding of morphosyntactic information. Our results provide empirical evidence that downstream NLP systems addressing tasks dependent on morphosyntactic information can benefit from word cluster features.

2018

Abbreviation Expander - a Web-based System for Easy Reading of Technical Documents
Manuel R. Ciosici | Ira Assent
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

Abbreviations and acronyms are a part of textual communication in most domains. However, abbreviations are not necessarily defined in documents that employ them. Understanding all abbreviations used in a given document often requires extensive knowledge of the target domain and the ability to disambiguate based on context. This creates considerable entry barriers to newcomers and difficulties in automated document processing. Existing abbreviation expansion systems or tools require substantial technical knowledge for set up or make strong assumptions which limit their use in practice. Here, we present Abbreviation Expander, a system that builds on state of the art methods for identification of abbreviations, acronyms and their definitions and a novel disambiguator for abbreviation expansion in an easily accessible web-based solution.