Proceedings of the 9th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing
Boris Velichkov | Ivelina Nikolova-Koleva | Milena Slavcheva
A Multi-Baseline Framework for Ranking Global Event Significance Using Google Trends and Large Language Models
Zenan Chen
Determining the significance of global events is hampered by the lack of standardized metrics for quantifying worldwide impact. While Google Trends has demonstrated utility in domain-specific studies, its application to global event ranking remains limited. This paper presents a framework combining Google Trends data with large language models for automated global event ranking. The study leverages Command R+ and Llama 3.3-70B-Instruct to generate contextually relevant event keywords and establishes significance through comparative search volume analysis against baseline reference terms, incorporating temporal weighting mechanisms to address chronological biases. The proposed methodology identified globally significant events across technology, health, sports, and natural disasters from a dataset of 1,094 events (2020-2024) extracted from Wikipedia.
Investigating Hierarchical Structure in Multi-Label Document Classification
Artemis Dampa
Effectively organizing the vast and ever-growing body of research in scientific literature is crucial to advancing the field and supporting scholarly discovery. In this paper, we study the task of fine-grained hierarchical multi-label classification of scholarly articles, using a structured taxonomy. Specifically, we investigate whether incorporating hierarchical information in a classification method can improve performance compared to conventional flat classification approaches. To this end, we propose and evaluate different classification strategies along three axes: selection of positive and negative samples; soft-to-hard label mapping; and hierarchical post-processing policies that use taxonomy-related requirements to update the final labeling. Experiments demonstrate that flat approaches constitute powerful baselines, but the infusion of hierarchical knowledge leads to better recall-focused performance depending on use-case requirements.
Large Language Models for Lexical Resource Enhancement: Multiple Hypernymy Resolution in WordNet
Dimitar Hristov
Large language models (LLMs) have materially changed natural language processing (NLP). While LLMs have shifted focus away from traditional semantics-based resources, structured linguistic databases such as WordNet remain essential for precise knowledge retrieval, decision making, and aiding LLM development. WordNet organizes concepts through synonym sets (synsets) and semantic links but suffers from inconsistencies, including redundant or erroneous relations. This paper investigates an approach that uses LLMs to aid the refinement of structured language resources, specifically WordNet, by automating multiple hypernymy resolution, leveraging the LLMs' semantic knowledge to produce tools for aiding and evaluating manual resource improvement.
Automated classification of causal relations. Evaluating different LLM performances.
Giacomo Magnifico
The search for formal causal relations in natural language faces inherent limitations due to the lack of mathematically and logically informed datasets. Thus, the exploration of causal relations in natural language leads to the analysis of formal-logic-adjacent language patterns. Thanks to recent advancements in generative LLMs, this research niche is expanding within the field of natural language processing and evaluation. In this work, we evaluate nine models produced by different AI companies in order to answer the question “Are LLMs capable of discerning between different types of causal relations?”. The SciExpl dataset is chosen as the natural language corpus, and we develop three prompt types aligned with zero-shot, few-shot, and chain-of-thought standards to evaluate the performance of the LLMs. Claude 3.7 Sonnet and Gemini 2.5 Flash Preview emerge as the best models for the task, with the respective highest F1 scores of 0.842 (few-shot prompting) and 0.846 (chain-of-thought prompting).
Study on Automatic Punctuation Restoration in Bilingual Broadcast Stream
Martin Polacek
In this study, we employ various ELECTRA-Small models that are pre-trained and fine-tuned on specific sets of languages for automatic punctuation restoration (APR) in automatically transcribed TV and radio shows, which contain conversations in two closely related languages. Our evaluation data specifically concerns bilingual interviews in Czech and Slovak and data containing speeches in Swedish and Norwegian. We train and evaluate three types of models: the multilingual (mELECTRA) model, which is pre-trained for 13 European languages; two bilingual models, each pre-trained for one language pair; and four monolingual models, each pre-trained for a single language. Our experimental results show that a) fine-tuning, which must be performed using data belonging to both target languages, is the key step in developing a bilingual APR system and b) the mELECTRA model yields competitive results, making it a viable option for bilingual APR and other multilingual applications. Thus, we publicly release our pre-trained bilingual and, in particular, multilingual ELECTRA-small models on HuggingFace, fostering further research in various multilingual tasks.
NoCs: A Non-Compound-Stable Splitter for German Compounds
Carmen Schacht
Compounding—the creation of highly complex lexical items through the combination of existing lexemes—can be considered one of the most efficient communication phenomena, though the automatic processing of compound structures—especially of multi-constituent compounds—poses significant challenges for natural language processing. Existing tools like compound-split (Tuggener, 2016) perform well on compound head detection but are limited in handling long compounds and in distinguishing compounds from non-compounds. This paper introduces NoCs (non-compound-stable splitter), a novel Python-based tool that extends the functionality of compound-split by incorporating recursive splitting, non-compound detection, and integration with state-of-the-art linguistic resources. NoCs employs a custom stack-and-buffer mechanism to traverse and decompose compounds robustly, even in cases involving multiple constituents. A large-scale evaluation using adapted GermaNet data shows that NoCs substantially outperforms compound-split in both non-compound identification and the recursive splitting of three- to five-constituent compounds, demonstrating its utility as a reliable resource for compound analysis in German.
A Proposal for Evaluating the Linguistic Quality of Synthetic Spanish Corpora
Lucia Sevilla-Requena
Large language models (LLMs) rely heavily on high-quality training data, yet human-generated corpora face increasing scarcity due to legal and practical constraints. Synthetic data generated by LLMs is emerging as a scalable alternative; however, concerns remain about its linguistic quality and diversity. While previous research has identified potential degradation in English synthetic corpora, the effects in Spanish, a language with distinct grammatical characteristics, remain underexplored. This research proposal aims to conduct a systematic linguistic evaluation of synthetic Spanish corpora generated by state-of-the-art LLMs, comparing them with human-written texts. The study will analyse three key dimensions: lexical, syntactic, and semantic diversity, using established corpus linguistics metrics. Through this comparative framework, the proposal intends to identify potential linguistic simplifications and degradation patterns in synthetic Spanish data. Ultimately, the proposed outcome is expected to contribute valuable insights to support the creation of robust and reliable Natural Language Processing (NLP) models for Spanish.
Personalizing chatbot communication with associative memory
Kirill Soloshenko | Alexandra Shatalina | Marina Sevostyanova | Elizaveta Kornilova | Konstantin Zaitsev
In this paper we present an approach aimed at effectively expanding the context by integrating an associative memory database into the pipeline. To improve long-term memory and personalization, we utilize methods close to Retrieval-Augmented Generation (RAG). Our method uses a multi-agent pipeline with a cold-start agent for initial interactions, a fact extraction agent to process user inputs, an associative memory agent for storing and retrieving context, and a generation agent for replying to user queries. Evaluation results are promising: a 41-percentage-point accuracy improvement over the base Gemma3 model (from 16% to 57%). Hence, with our approach, we demonstrate that personalized chatbots can bypass LLM memory limitations while increasing information reliability under the conditions of limited context and memory.
Visualization of LLM Annotated Documents
Teodor Todorov Valtchev | Nikolay Paev
The paper presents an automatic annotation and visualization system for documents in the field of Social Sciences and Humanities (SS&H). The annotation is on two levels: named entities and events. The system combines automatically generated annotations from language models with a powerful text editor that is extended to accommodate manual annotation. The goal is to support the extraction of information from historical documents by scientists in the SS&H field. At the time of writing, the system is still in development.