2025
Fine-Tuning Medium-Scale LLMs for Joint Intent Classification and Slot Filling: A Data-Efficient and Cost-Effective Solution for SMEs
Maia Aguirre | Ariane Méndez | Arantza del Pozo | Maria Ines Torres | Manuel Torralbo
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Dialogue Systems (DS) are increasingly in demand for automating tasks through natural language interactions. However, the core techniques for user comprehension in DS depend heavily on large amounts of labeled data, limiting their applicability in the data-scarce environments common to many companies. This paper identifies best practices for data-efficient development and cost-effective deployment of DS in real-world application scenarios. We evaluate whether fine-tuning a medium-sized Large Language Model (LLM) for joint Intent Classification (IC) and Slot Filling (SF), with moderate hardware requirements still affordable for SMEs, can achieve competitive performance using less data than current state-of-the-art models. Experiments on the Spanish and English portions of the MASSIVE corpus demonstrate that, in monolingual settings, the Llama-3-8B-Instruct model fine-tuned with only 10% of the data outperforms both the JointBERT architecture and GPT-4o in a zero-shot prompting setup. In cross-lingual scenarios, Llama-3-8B-Instruct drastically outperforms multilingual JointBERT when fine-tuned in one language and evaluated in the other.
2024
Speech Emotion Recognition for Call Centers using Self-supervised Models: A Complete Pipeline for Industrial Applications
Juan M. Martín-Doñas | Asier López Zorrilla | Mikel deVelasco | Juan Camilo Vasquez-Correa | Aitor Álvarez | Maria Inés Torres | Paz Delgado | Ane Lazpiur | Blanca Romero | Irati Alkorta
Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)
Knowledge-Grounded Dialogue Act Transfer using Prompt-Based Learning for Controllable Open-Domain NLG
Alain Vazquez Risco | Angela Maria Ramirez | Neha Pullabhotla | Nan Qiang | Haoran Zhang | Marilyn Walker | Maria Ines Torres
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Open domain spoken dialogue systems need to controllably generate many different dialogue acts (DAs) to allow Natural Language Generation (NLG) to create interesting and engaging conversational interactions with users. We aim to create an NLG engine that can produce a variety of DAs that make substantive knowledge-grounded contributions to a conversation. Training such an NLG typically requires dialogue corpora that are labelled for DAs, which are expensive to produce and vulnerable to quality issues. Here, we present a prompt-based learning approach to transfer DAs from one domain, video games, to 7 new domains. For each novel domain, we first crawl WikiData to create Meaning Representations that systematically vary both the number of attributes and hops on the WikiData Knowledge Graph. The proposed method involves a self-training step to create prompt examples for each domain followed by an overgeneration and ranking step. The result is a novel, high-quality dataset, Wiki-Dialogue, of 71K knowledge-grounded utterances, covering 9 DAs and the Art, Movies, Music, Sports, TV, Animal, and Boardgames domains, whose combined DA and semantic accuracy is 89%. We assess the corpus quality using both automatic and human evaluations and find it high. The corpus is found to be safe, lexically rich, and large in vocabulary, when compared to similar datasets.
Incremental Learning for Knowledge-Grounded Dialogue Systems in Industrial Scenarios
Izaskun Fernandez | Cristina Aceta | Cristina Fernandez | Maria Ines Torres | Aitor Etxalar | Ariane Mendez | Maia Agirre | Manuel Torralbo | Arantza Del Pozo | Joseba Agirre | Egoitz Artetxe | Iker Altuna
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
In today’s industrial landscape, seamless collaboration between humans and machines is essential and requires a shared knowledge of the operational domain. In this framework, the technical knowledge for operator assistance has traditionally been derived from static sources such as technical documents. However, experienced operators hold invaluable know-how that can significantly contribute to supporting other operators. This work focuses on enhancing operator assistance tasks in the manufacturing industry by leveraging spoken natural language interaction. More specifically, a Human-in-the-Loop (HIL) incremental learning approach is proposed to dynamically integrate this expertise into a domain knowledge graph (KG), along with the use of in-context learning for Large Language Models (LLMs) to benefit other capabilities of the system. Preliminary results of experiments carried out in an industrial scenario, where the graph size was increased by 25%, demonstrate that incrementally enhancing the KG benefits the dialogue system’s performance.
2023
Compiling a Corpus of Technical Documents for Dialogue System Development in the Industrial Sector
Laura García-Sardiña | Eneko Ruiz | Cristina Aceta | Izaskun Fernández | Maria Inés Torres | Arantza del Pozo
Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023)