Veronika Smilga

2024

TüDuo at SemEval-2024 Task 2: Flan-T5 and Data Augmentation for Biomedical NLI
Veronika Smilga | Hazem Alabiad
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper explores using data augmentation with smaller language models under 3 billion parameters for the SemEval-2024 Task 2 on Biomedical Natural Language Inference for Clinical Trials. We fine-tune models from the Flan-T5 family with and without using augmented data automatically generated by GPT-3.5-Turbo and find that data augmentation through techniques like synonym replacement, syntactic changes, adding random facts, and meaning reversion improves model faithfulness (ability to change predictions for semantically different inputs) and consistency (ability to give same predictions for semantic preserving changes). However, data augmentation tends to decrease performance on the original dataset distribution, as measured by F1 score. Our best system is the Flan-T5 XL model fine-tuned on the original training data combined with over 6,000 augmented examples. The system ranks in the top 10 for all three metrics.

2023

pdf bib abs

ChatGPT vs. Crowdsourcing vs. Experts: Annotating Open-Domain Conversations with Speech Functions
Lidiia Ostyakova | Veronika Smilga | Kseniia Petukhova | Maria Molchanova | Daniel Kornev
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue

This paper deals with the task of annotating open-domain conversations with speech functions. We propose a semi-automated method for annotating dialogs following the topic-oriented, multi-layered taxonomy of speech functions with the use of hierarchical guidelines using Large Language Models. These guidelines comprise simple questions about the topic and speaker change, sentence types, pragmatic aspects of the utterance, and examples that aid untrained annotators in understanding the taxonomy. We compare the results of dialog annotation performed by experts, crowdsourcing workers, and ChatGPT. To improve the performance of ChatGPT, several experiments utilising different prompt engineering techniques were conducted. We demonstrate that in some cases large language models can achieve human-like performance following a multi-step tree-like annotation pipeline on complex discourse annotation, which is usually challenging and costly in terms of time and money when performed by humans.

pdf bib abs

An open-source DeepPavlov Dream Platform is specifically tailored for development of complex dialog systems like Generative AI Assistants. The stack prioritizes efficiency, modularity, scalability, and extensibility with the goal to make it easier to develop complex dialog systems from scratch. It supports modular approach to implementation of conversational agents enabling their development through the choice of NLP components and conversational skills from a rich library organized into the distributions of ready-for-use multi-skill AI assistant systems. In DeepPavlov Dream, multi-skill Generative AI Assistant consists of NLP components that extract features from user utterances, conversational skills that generate or retrieve a response, skill and response selectors that facilitate choice of relevant skills and the best response, as well as a conversational orchestrator that enables creation of multi-skill Generative AI Assistants scalable up to industrial grade AI assistants. The platform allows to integrate large language models into dialog pipeline, customize with prompt engineering, handle multiple prompts during the same dialog session and create simple multimodal assistants.

Co-authors

Venues

Fix author