Branislav Pecher


2024

pdf bib
On Sensitivity of Learning with Limited Labelled Data to the Effects of Randomness: Impact of Interactions and Systematic Choices
Branislav Pecher | Ivan Srba | Maria Bielikova
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

While learning with limited labelled data can effectively deal with a lack of labels, it is also sensitive to the effects of uncontrolled randomness introduced by so-called randomness factors (i.e., non-deterministic decisions such as choice or order of samples). We propose and formalise a method to systematically investigate the effects of individual randomness factors while taking the interactions (dependence) between them into consideration. To this end, our method mitigates the effects of other factors while observing how the performance varies across multiple runs. Applying our method to multiple randomness factors across in-context learning and fine-tuning approaches on 7 representative text classification tasks and meta-learning on 3 tasks, we show that: 1) disregarding interactions between randomness factors in existing works led to inconsistent findings due to incorrect attribution of the effects of randomness factors, such as disproving the consistent sensitivity of in-context learning to sample order even with random sample selection; and 2) besides mutual interactions, the effects of randomness factors, especially sample order, are also dependent on more systematic choices unexplored in existing works, such as number of classes, samples per class or choice of prompt format.

pdf bib
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher | Jan Cegin | Robert Belanec | Jakub Simko | Ivan Srba | Maria Bielikova
Findings of the Association for Computational Linguistics: EMNLP 2024

While fine-tuning of pre-trained language models generally helps to overcome the lack of labelled training samples, it also displays model performance instability. This instability mainly originates from randomness in initialisation or data shuffling. To address this, researchers either modify the training process or augment the available samples, which typically results in increased computational costs. We propose a new mitigation strategy, called **Delayed Ensemble with Noisy Interpolation (DENI)**, that leverages the strengths of ensembling, noise regularisation and model interpolation, while retaining computational efficiency. We compare DENI with 9 representative mitigation strategies across 3 models, 4 tuning strategies and 7 text classification datasets. We show that: 1) DENI outperforms the best performing mitigation strategy (Ensemble), while using only a fraction of its cost; 2) the mitigation strategies are beneficial for parameter-efficient fine-tuning (PEFT) methods, outperforming full fine-tuning in specific cases; and 3) combining DENI with data augmentation often leads to even more effective instability mitigation.

pdf bib
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation
Jan Cegin | Branislav Pecher | Jakub Simko | Ivan Srba | Maria Bielikova | Peter Brusilovsky
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The latest generative large language models (LLMs) have found their application in data augmentation tasks, where small numbers of text samples are LLM-paraphrased and then used to fine-tune downstream models. However, more research is needed to assess how different prompts, seed data selection strategies, filtering methods, or model settings affect the quality of paraphrased data (and downstream models). In this study, we investigate three text diversity incentive methods well established in crowdsourcing: taboo words, hints by previous outlier solutions, and chaining on previous outlier solutions. Using these incentive methods as part of instructions to LLMs augmenting text datasets, we measure their effects on generated texts’ lexical diversity and downstream model performance. We compare the effects over 5 different LLMs, 6 datasets and 2 downstream models. We show that diversity is most increased by taboo words, but downstream model performance is highest with hints.

2023

pdf bib
KInITVeraAI at SemEval-2023 Task 3: Simple yet Powerful Multilingual Fine-Tuning for Persuasion Techniques Detection
Timo Hromadka | Timotej Smolen | Tomas Remis | Branislav Pecher | Ivan Srba
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper presents the best-performing solution to the SemEval 2023 Task 3 on the subtask 3 dedicated to persuasion techniques detection. Due to a high multilingual character of the input data and a large number of 23 predicted labels (causing a lack of labelled data for some language-label combinations), we opted for fine-tuning pre-trained transformer-based language models. Conducting multiple experiments, we find the best configuration, which consists of large multilingual model (XLM-RoBERTa large) trained jointly on all input data, with carefully calibrated confidence thresholds for seen and surprise languages separately. Our final system performed the best on 6 out of 9 languages (including two surprise languages) and achieved highly competitive results on the remaining three languages.