Diana Trandabat

Also published as: Diana Trandabăţ, Diana Marie Trandabăţ, Diana Trandabăț, Diana Trandăbăț

2025

FII the Best at SemEval 2025 Task 2: Steering State-of-the-art Machine Translation Models with Strategically Engineered Pipelines for Enhanced Entity Translation
Delia - Iustina Grigorita | Tudor - Constantin Pricop | Sergio - Alessandro Suteu | Daniela Gifu | Diana Trandabat
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Entity-Aware Machine Translation (EAMT) aims to enhance the accuracy of machine translation (MT) systems in handling named entities, including proper names, domain-specific terms, and structured references. Conventional MT models often struggle to accurately translate these entities, leading to errors that affect comprehension and reliability. In this paper, we present a promising approach for SemEval 2025 Task 2, focusing on improving EAMT in ten target languages. The methodology is based on two complementary strategies: (1) multilingual Named Entity Recognition (NER) and structured knowledge bases for preprocessing and integrating entity translations, and (2) large language models (LLMs) enhanced with optimized prompts and validation mechanisms to improve entity preservation. By combining structured knowledge with neural approaches, this system aims to mitigate entity-related translation errors and enhance the overall performance of MT models. Among the systems that do not use gold information, retrieval-augmented generation (RAG), or fine-tuning, our approach ranked 1st with the second strategy and 3rd with the first strategy.

2024

LinguisTech at SemEval-2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversation
Mihaela Alexandru | Călina Ciocoiu | Ioana Măniga | Octavian Ungureanu | Daniela Gîfu | Diana Trandăbăț
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

The “Emotion Discovery and Reasoning Its Flip in Conversation” task at the SemEval 2024 competition focuses on the automatic recognition of emotion flips, triggered within multi-party textual conversations. This paper proposes a novel approach that draws a parallel between a mixed strategy and a comparative strategy, contrasting a Rule-Based Function with Named Entity Recognition (NER)—an approach that shows promise in understanding speaker-specific emotional dynamics. Furthermore, this method surpasses the performance of both DistilBERT and RoBERTa models, demonstrating competitive effectiveness in detecting emotion flips triggered in multi-party textual conversations, achieving a 70% F1-score. This system was ranked 6th in the SemEval 2024 competition for Subtask 3.

2023

FII_Better at SemEval-2023 Task 2: MultiCoNER II Multilingual Complex Named Entity Recognition
Viorica-Camelia Lupancu | Alexandru-Gabriel Platica | Cristian-Mihai Rosu | Daniela Gifu | Diana Trandabat
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This task focuses on identifying complex named entities (NEs) in several languages. In the context of the SemEval-2023 competition, our team explores the capabilities of a base transformer model on this task, focusing specifically on five languages (English, Spanish, Swedish, German, Italian). We take DistilBERT and BERT as two examples of basic transformer models, using DistilBERT as a baseline and BERT as the platform for an improved model. The dataset we use, MultiCoNER II, is a large multilingual NER dataset that covers domains such as Wiki sentences, questions, and search queries across 12 languages. It contains 26M tokens and is assembled from public resources. MultiCoNER II defines an NER tag set with 6 classes and 67 tags. We obtained moderate results in the English track (ranking 17th out of 34), while our results in the other tracks leave room for improvement (overall third to last).

FII SMART at SemEval 2023 Task7: Multi-evidence Natural Language Inference for Clinical Trial Data
Mihai Volosincu | Cosmin Lupu | Diana Trandabat | Daniela Gifu
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

The “Multi-evidence Natural Language Inference for Clinical Trial Data” task at the SemEval 2023 competition focuses on extracting essential information on clinical trial data, by posing two subtasks on textual entailment and evidence retrieval. In the context of SemEval, we present a comparison between a method based on the BioBERT model and a CNN model. The task is based on a collection of breast cancer Clinical Trial Reports (CTRs), statements, explanations, and labels annotated by domain expert annotators. We achieved F1 scores of 0.69 for determining the inference relation (entailment vs contradiction) between CTR-statement pairs. The implementation of our system is made available via GitHub - https://github.com/volosincu/FII_Smart__Semeval2023.

Togedemaru at SemEval-2023 Task 8: Causal Medical Claim Identification and Extraction from Social Media Posts
Andra Oica | Daniela Gifu | Diana Trandabat
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

The “Causal Medical Claim Identification and Extraction from Social Media Posts” task at the SemEval 2023 competition focuses on identifying and validating medical claims in English, by posing two subtasks on causal claim identification and PIO (Population, Intervention, Outcome) frame extraction. In the context of SemEval, we present a method for sentence classification into four categories (claim, experience, experience_based_claim, or question) based on a BioBERT model with an MLP layer. The website from which the dataset was gathered, Reddit, is a social news and content discussion site. The evaluation results show the effectiveness of the solution of this study (83.68%).

2021

FII_CROSS at SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation
Ciprian Bodnar | Andrada Tapuc | Cosmin Pintilie | Daniela Gifu | Diana Trandabat
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper presents a word-in-context disambiguation system. The task focuses on capturing the polysemous nature of words in a multilingual and cross-lingual setting, without considering a strict inventory of word meanings. The system applies Natural Language Processing algorithms on datasets from SemEval 2021 Task 2, being able to identify the meaning of words for the languages Arabic, Chinese, English, French and Russian, without making use of any additional mono- or multilingual resources.

2020

This paper describes the ongoing work carried out within the CoBiLiRo (Bimodal Corpus for Romanian Language) research project, part of ReTeRom (Resources and Technologies for Developing Human-Machine Interfaces in Romanian). Data annotation finds increasing use in speech recognition and synthesis, with the goal of supporting learning processes. In this context, a variety of annotation systems for Speech and Text Processing environments have been presented. Even though many designs for the data annotation workflow have emerged, the process of handling metadata to manage complex user-defined annotations is not sufficiently covered. We propose a format designed to serve as an annotation standard for bimodal resources, which facilitates searching, editing, and statistical analysis operations over them. The design and implementation of an infrastructure that houses the resources are also presented. The goal is to widen the dissemination of bimodal corpora for research valorisation and use in applications. This study also reports on the main operations of the web Platform that hosts the corpus and on the automatic conversion flows that bring the submitted files to the format accepted by the Platform.

2019

Hope at SemEval-2019 Task 6: Mining social media language to discover offensive language
Gabriel Florentin Patras | Diana Florina Lungu | Daniela Gifu | Diana Trandabat
Proceedings of the 13th International Workshop on Semantic Evaluation

User content sharing through social media has reached huge proportions nowadays. However, along with the free expression of thoughts on social media, people risk being exposed to various aggressive statements. In this paper, we present a system able to identify and classify offensive user-generated content.

2018

EmoIntens Tracker at SemEval-2018 Task 1: Emotional Intensity Levels in #Tweets
Ramona-Andreea Turcu | Sandra Maria Amarandei | Iuliana-Alexandra Flescan-Lovin-Arseni | Daniela Gifu | Diana Trandabat
Proceedings of the 12th International Workshop on Semantic Evaluation

The “Affect in Tweets” task is centered on emotion categorization and an evaluation matrix using multi-language tweets (English and Spanish). In this research, the SemEval Affect dataset was preprocessed, categorized, and evaluated accordingly (precision, recall, and accuracy). The system described in this paper is based on the implementation of supervised machine learning (Naive Bayes, KNN and SVM), deep learning (a TensorFlow neural network model), and decision tree algorithms.

The Dabblers at SemEval-2018 Task 2: Multilingual Emoji Prediction
Larisa Alexa | Alina Lorenț | Daniela Gîfu | Diana Trandabăț
Proceedings of the 12th International Workshop on Semantic Evaluation

The “Multilingual Emoji Prediction” task focuses on the ability to predict the corresponding emoji for a given tweet. In this paper, we investigate the relation between words and emojis. To do so, we used supervised machine learning (Naive Bayes) and deep learning (Recursive Neural Network).

Apollo at SemEval-2018 Task 9: Detecting Hypernymy Relations Using Syntactic Dependencies
Mihaela Onofrei | Ionuț Hulub | Diana Trandabăț | Daniela Gîfu
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper presents the participation of Apollo’s team in the SemEval-2018 Task 9 “Hypernym Discovery”, Subtask 1: “General-Purpose Hypernym Discovery”, which tries to produce a ranked list of hypernyms for a specific term. We propose a novel approach for automatic extraction of hypernymy relations from a corpus by using dependency patterns. We estimated that the application of these patterns leads to a higher score than using the traditional lexical patterns.

2017

This paper presents Wild Devs’ participation in the SemEval-2017 Task 2 “Multi-lingual and Cross-lingual Semantic Word Similarity”, which tries to automatically measure the semantic similarity between two words. The system was built using neural networks, taking as input a collection of word pairs and producing as output a list of scores, from 0 to 4, corresponding to the degree of similarity between the word pairs.

This paper presents the participation of #WarTeam in Task 6 of SemEval-2017 with a system classifying humor by comparing and ranking tweets. The training data consists of annotated tweets from the @midnight TV show. #WarTeam’s system uses a neural network (TensorFlow) that takes inputs from a Naïve Bayes humor classifier and a sentiment analyzer.

Semantic databases are a stable starting point in developing knowledge-based systems. Since creating language resources demands considerable time, money, and human effort, a possible solution is to import a resource annotation from one language to another. This paper presents the creation of a semantic role database for Romanian, starting from the English FrameNet semantic resource. The intuition behind the importing program is that most of the frames defined in the English FN are likely to be valid cross-lingually, since semantic frames express conceptual structures that are language independent at the deep structure level. The surface realization is carried out according to each language's syntactic constraints. In the paper we present the advantages of choosing to import the English FrameNet annotation instead of annotating a new corpus. We also take into account the mismatches encountered in the validation process. The rules created to manage particular situations are used to improve the import program. We believe the information and arguments in this paper could be of interest to those who wish to develop FrameNet-like systems for other languages.