Ivan Sekulić

Also published as: Ivan Sekulic

2025

Efficient Out-of-Scope Detection in Dialogue Systems via Uncertainty-Driven LLM Routing
Álvaro Zaera | Diana Nicoleta Popa | Ivan Sekulic | Paolo Rosso
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)

Out-of-scope (OOS) intent detection is a critical challenge in task-oriented dialogue systems (TODS), as it ensures robustness to unseen and ambiguous queries. In this work, we propose a novel but simple modular framework that combines uncertainty modeling with fine-tuned large language models (LLMs) for efficient and accurate OOS detection. The first step applies uncertainty estimation to the output of an in-scope intent detection classifier, which is currently deployed in a real-world TODS handling tens of thousands of user interactions daily. The second step then leverages an emerging LLM-based approach, where a fine-tuned LLM is triggered to make a final decision on instances with high uncertainty. Unlike prior approaches, our method effectively balances computational efficiency and performance, combining traditional approaches with LLMs and yielding state-of-the-art results on key OOS detection benchmarks, including real-world OOS data acquired from a deployed TODS.
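The two-step routing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the uncertainty measure, so Shannon entropy over the classifier's class probabilities is assumed here, and the names `entropy`, `route_intent`, `llm_fallback`, and the threshold value are all hypothetical.

```python
import math
from typing import Callable, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def route_intent(
    utterance: str,
    probs: Sequence[float],
    labels: Sequence[str],
    llm_fallback: Callable[[str], str],
    threshold: float = 0.5,  # assumed value; would be tuned on held-out data
) -> str:
    """Return the classifier's label when it is confident, else defer to the LLM.

    `probs` stands in for the in-scope classifier's output distribution.
    `llm_fallback` stands in for the fine-tuned LLM that makes the final
    decision (for example, returning "oos") on high-uncertainty inputs.
    """
    if entropy(probs) <= threshold:
        # Low uncertainty: keep the cheap classifier's top prediction.
        best = max(range(len(probs)), key=lambda i: probs[i])
        return labels[best]
    # High uncertainty: only now pay for the LLM call.
    return llm_fallback(utterance)
```

Under these assumptions, only the utterances whose entropy exceeds the threshold incur an LLM call, which is where the efficiency gain would come from.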

Detecting user frustration in modern-day task-oriented dialog (TOD) systems is imperative for maintaining overall user satisfaction, engagement, and retention. However, most recent research is focused on sentiment and emotion detection in academic settings, thus failing to fully capture the implications of real-world user data. To mitigate this gap, in this work, we focus on user frustration in a deployed TOD system, assessing the feasibility of out-of-the-box solutions for user frustration detection. Specifically, we compare the performance of our deployed keyword-based approach, open-source approaches to sentiment analysis, dialog breakdown detection methods, and emerging in-context learning LLM-based detection. Our analysis highlights the limitations of open-source methods for real-world frustration detection, while demonstrating the superior performance of the LLM-based approach, achieving a 16% relative improvement in F1 score on an internal benchmark. Finally, we analyze the advantages and limitations of our methods and provide insights into the user frustration detection task for industry practitioners.

2024

In the realm of dialogue systems, user simulation techniques have emerged as a game-changer, redefining the evaluation and enhancement of task-oriented dialogue (TOD) systems. These methods are crucial for replicating real user interactions, enabling applications like synthetic data augmentation, error detection, and robust evaluation. However, existing approaches often rely on rigid rule-based methods or on annotated data. This paper introduces DAUS, a Domain-Aware User Simulator. Leveraging large language models, we fine-tune DAUS on real examples of task-oriented dialogues. Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment. Notably, we have observed that fine-tuning enhances the simulator’s coherence with user goals, effectively mitigating hallucinations—a major source of inconsistencies in simulator responses.

2020

Reasoning with Latent Structure Refinement for Document-Level Relation Extraction
Guoshun Nan | Zhijiang Guo | Ivan Sekulic | Wei Lu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Document-level relation extraction requires integrating information within and across multiple sentences of a document and capturing complex interactions between inter-sentence entities. However, effective aggregation of relevant information in the document remains a challenging research question. Existing approaches construct static document-level graphs based on syntactic trees, co-references or heuristics from the unstructured text to model the dependencies. Unlike previous methods that may not be able to capture rich non-local interactions for inference, we propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph. We further develop a refinement strategy, which enables the model to incrementally aggregate relevant information for multi-hop reasoning. Specifically, our model achieves an F1 score of 59.05 on a large-scale document-level dataset (DocRED), significantly improving over the previous results, and also yields new state-of-the-art results on the CDR and GDA datasets. Furthermore, extensive analyses show that the model is able to discover more accurate inter-sentence relations.

2019

Adapting Deep Learning Methods for Mental Health Prediction on Social Media
Ivan Sekulic | Michael Strube
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

Mental health poses a significant challenge for an individual’s well-being. Text analysis of rich resources, like social media, can contribute to deeper understanding of illnesses and provide means for their early detection. We tackle a challenge of detecting social media users’ mental status through deep learning-based models, moving away from traditional approaches to the task. In a binary classification task on predicting if a user suffers from one of nine different disorders, a hierarchical attention network outperforms previously set benchmarks for four of the disorders. Furthermore, we explore the limitations of our model and analyze phrases relevant for classification by inspecting the model’s word-level attention weights.

2018

Not Just Depressed: Bipolar Disorder Prediction on Reddit
Ivan Sekulic | Matej Gjurković | Jan Šnajder
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Bipolar disorder, an illness characterized by manic and depressive episodes, affects more than 60 million people worldwide. We present a preliminary study on bipolar disorder prediction from user-generated text on Reddit, which relies on users’ self-reported labels. Our benchmark classifiers for bipolar disorder prediction outperform the baselines and reach accuracy and F1-scores of above 86%. Feature analysis shows interesting differences in language use between users with bipolar disorders and the control group, including differences in the use of emotion-expressive words.

2016

VerbCROcean: A Repository of Fine-Grained Semantic Verb Relations for Croatian
Ivan Sekulić | Jan Šnajder
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we describe VerbCROcean, a broad-coverage repository of fine-grained semantic relations between Croatian verbs. Adopting the methodology of Chklovski and Pantel (2004) used for acquiring the English VerbOcean, we first acquire semantically related verb pairs from the hrWaC web corpus by relying on distributional similarity of subject-verb-object paths in the dependency trees. We then classify the semantic relations between each pair of verbs as similarity, intensity, antonymy, or happens-before, using a number of manually constructed lexico-syntactic patterns. We evaluate the quality of the resulting resource on a manually annotated sample of 1000 semantic verb relations. The evaluation revealed that the predictions are most accurate for the similarity relation, and least accurate for the intensity relation. We make available two variants of VerbCROcean: a coverage-oriented version, containing about 36k verb pairs at a precision of 41%, and a precision-oriented version containing about 5k verb pairs, at a precision of 56%.
