Catarina Silva

2026

Unsupervised Evaluation of Explanations for Hate Speech Classification in Portuguese
Isabel Carvalho | Hugo Gonçalo Oliveira | Catarina Silva
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

Top-performing Artificial Intelligence models often operate as black boxes. Explainable AI (XAI) can increase transparency, but its evaluation is currently hindered by a lack of annotated explanation data and agreed-upon validation standards. We propose a framework for evaluating the faithfulness of explanations in Portuguese hate speech detection. Our approach is based on the premise that a faithful explanation should identify features whose removal degrades a model’s performance. We follow a three-step process: (i) prediction on the original input; (ii) identification and removal of explanatory keywords; and (iii) prediction on the modified input, with performance differences used as an evaluation signal. We conduct experiments using ensemble classifiers, multiple keyword selection strategies, and SHAP and LIME as XAI methods. In addition, Large Language Models (LLMs) are explored both as classifiers and as explainers. Results demonstrate that removing explanatory keywords degrades model performance more than random word removal, indicating explanation faithfulness. Notably, SHAP and LIME consistently provided more faithful explanations than LLM-generated or manual alternatives, although the impact depends on the keyword selection strategy. These findings highlight the importance of standardised, unsupervised evaluation protocols for XAI and the faithfulness limitations of current generative LLM explanations.
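The three-step process described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the lexicon-based `toy_clf`, the `toy_explainer`, the example sentences, and the helper names are all hypothetical stand-ins for the ensemble classifiers and the SHAP, LIME or LLM explainers used in the paper.

```python
import random
from typing import Callable, List, Sequence

Classifier = Callable[[str], int]        # text -> predicted label
WordPicker = Callable[[str], List[str]]  # text -> words to remove


def remove_words(text: str, words: Sequence[str]) -> str:
    """Drop every token of `text` that appears in `words` (case-insensitive)."""
    banned = {w.lower() for w in words}
    return " ".join(t for t in text.split() if t.lower() not in banned)


def accuracy(clf: Classifier, texts: Sequence[str], labels: Sequence[int]) -> float:
    return sum(clf(t) == y for t, y in zip(texts, labels)) / len(texts)


def performance_drop(clf: Classifier, texts: Sequence[str],
                     labels: Sequence[int], pick_words: WordPicker) -> float:
    """Step (i) accuracy on the originals minus step (iii) accuracy after step (ii) removal."""
    modified = [remove_words(t, pick_words(t)) for t in texts]
    return accuracy(clf, texts, labels) - accuracy(clf, modified, labels)


def random_words(text: str, k: int, rng: random.Random) -> List[str]:
    """Random-removal baseline: pick up to k tokens uniformly at random."""
    tokens = text.split()
    return rng.sample(tokens, min(k, len(tokens)))


# Hypothetical stand-ins for a real classifier and a real explainer.
LEXICON = {"awful", "idiots"}


def toy_clf(text: str) -> int:
    return int(any(t.lower() in LEXICON for t in text.split()))


def toy_explainer(text: str) -> List[str]:
    return [t for t in text.split() if t.lower() in LEXICON][:1]


texts = ["those idiots ruined everything again",
         "what a lovely sunny day",
         "awful people everywhere today"]
labels = [1, 0, 1]

rng = random.Random(0)
explained_drop = performance_drop(toy_clf, texts, labels, toy_explainer)
random_drop = performance_drop(toy_clf, texts, labels,
                               lambda t: random_words(t, 1, rng))
print(f"drop after removing explanatory keywords: {explained_drop:.2f}")
print(f"drop after removing random words:         {random_drop:.2f}")
```

Under this reading, an explanation method counts as more faithful when its drop exceeds the random-removal drop. Accuracy is used here only for simplicity; the abstract does not say which performance measure the paper compares.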

FlowDisco: Interactive Exploration of Dialogue Flows
Patrícia Ferreira | Carolina Loureiro | Ana Alves | Catarina Silva | Hugo Gonçalo Oliveira
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2

Analyzing large conversational datasets is often inefficient due to the linear nature of text, which hinders the tracking of interaction evolution over time. To address this, we present FlowDisco, an interactive platform for the automatic discovery and exploration of dialogue flows. The framework uses semantic embeddings and modular clustering to transform raw text into probabilistic dialogue flows. By providing a web interface with dynamic filtering and a suite of analytical metrics, FlowDisco simplifies the visual identification and validation of conversational behaviors at scale. The platform’s utility is demonstrated through real-world application scenarios, including customer support interactions and multi-party political debates, where it successfully uncovers complex patterns and sentiment shifts that traditional sequential analysis often overlooks.

Prompt Engineering for Named Entity Extraction from Portuguese Legal Documents
Giovanni Maffeo | Catarina Silva | Hugo Gonçalo Oliveira
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

The growing volume and complexity of legal texts highlight the need for automatic methods capable of extracting structured information from unstructured documents. This challenge is even more severe for the Portuguese language, given the limited availability and high cost of annotated legal data. This work investigates whether prompt engineering over Large Language Models (LLMs) can effectively support legal Named Entity Recognition (NER) in low-supervision and low-resource settings through In-Context Learning (ICL). Using the LeNER-Br corpus, we evaluate category-specific prompts, different chunking sizes, and prompt engineering strategies. Entity-level evaluation using Exact Match Micro F1 shows that prompt engineering has a stronger impact on performance than other strategies. The best results were obtained with larger models, the 4-bit quantised Qwen-2.5:32B and GPT-5.2, achieving scores of 57.9% and 71.9%, respectively, highlighting the potential of this approach as an alternative to traditional supervised NER pipelines.
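For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of one common way to compute entity-level Exact Match Micro F1. It assumes entities are represented as `(start, end, label)` spans and that a prediction counts only when span and label both match a gold entity exactly; the function name, the representation and the example entities are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, label)


def exact_match_micro_f1(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> float:
    """Micro-averaged F1: counts are pooled over all documents before computing P and R."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Two toy documents with made-up spans and illustrative labels.
gold = [{(0, 2, "PESSOA"), (5, 8, "LEGISLACAO")}, {(3, 4, "LOCAL")}]
pred = [{(0, 2, "PESSOA"), (5, 7, "LEGISLACAO")},  # second span has a wrong boundary
        {(3, 4, "LOCAL"), (9, 10, "TEMPO")}]       # second entity is spurious

score = exact_match_micro_f1(gold, pred)
print(f"Exact Match Micro F1: {score:.4f}")  # 2 matches, 4 predicted, 3 gold -> 4/7
```

Because matching is exact, the prediction with a shifted boundary earns no partial credit, which makes this metric stricter than overlap-based alternatives.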

Analyzing Debate Dynamics in the Portuguese Parliament with Dialogue Action Flows
Patrícia Ferreira | Ana Alves | Catarina Silva | Hugo Gonçalo Oliveira
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

Analyzing how large-scale multi-party dialogues shape collective behavior is a central challenge in computational linguistics. However, traditional text-based methods often overlook the complex, non-linear turn-taking dynamics defining these interactions. To address this gap, we propose a framework based on Dialogue Action Flows (DAFs) that integrates verbal utterances and non-verbal actions into a unified probabilistic representation of interactional behavior. Interactions are encoded as speaker-action states, forming a probabilistic DAF that reveals dominant behavioral trajectories and recurrent patterns. We validate this framework on five years of Portuguese Parliament debates. Analysis reveals systematic behavioral asymmetries driven by party roles: while government parties exhibit increasing alignment, opposition forces, particularly the radical wing, maintain persistently high conflict. Additionally, the rising volume of interactions across legislative years indicates a progressively heated environment. Overall, our framework provides a quantitative and interpretable approach for modeling polarization, alignment, and interactional dynamics in multi-party political discourse.

2025

Cognitive Flow: An LLM-Automated Framework for Quantifying Reasoning Distillation
José Matos | Catarina Silva | Hugo Goncalo Oliveira
Proceedings of the 18th International Natural Language Generation Conference

The ability of large language models (LLMs) to reason effectively is crucial for a wide range of applications, from complex decision-making to scientific research. However, it remains unclear how well reasoning capabilities are transferred or preserved when LLMs undergo Knowledge Distillation (KD), a process that typically reduces model size while attempting to retain performance. In this study, we explore the effects of model distillation on the reasoning abilities of various reasoning language models (RLMs). We introduce Cognitive Flow, a novel framework that systematically extracts meaning and maps states in Chain-of-Thought (CoT) processes, offering new insights on model reasoning and enabling quantitative comparisons across RLMs. Using this framework, we investigate the impact of KD on CoTs produced by RLMs. We target DeepSeek-R1-671B and its distilled 70B, 32B and 14B versions, as well as QwenQwQ-32B from the Qwen series. We evaluate the models on three subsets of mathematical reasoning tasks with varying complexity from the MMLU benchmark. Our findings demonstrate that while distillation can effectively replicate a similar reasoning style under specific conditions, it struggles with simpler problems, revealing a significant divergence in the observable thought process and a potential limitation in the transfer of a robust and adaptable problem-solving capability.

2024

Sentiment-Aware Dialogue Flow Discovery for Interpreting Communication Trends
Patrícia Ferreira | Isabel Carvalho | Ana Alves | Catarina Silva | Hugo Gonçalo Oliveira
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Customer-support services increasingly rely on automation, whether fully or with human intervention. Despite optimising resources, this may result in mechanical protocols and lack of human interaction, thus reducing customer loyalty. Our goal is to enhance interpretability and provide guidance in communication through novel tools for easier analysis of message trends and sentiment variations. Monitoring these contributes to more informed decision-making, enabling proactive mitigation of potential issues, such as protocol deviations or customer dissatisfaction. We propose a generic approach for dialogue flow discovery that leverages clustering techniques to identify dialogue states, represented by related utterances. State transitions are further analyzed to detect prevailing sentiments. Hence, we discover sentiment-aware dialogue flows that offer an interpretability layer to artificial agents, even those based on black-boxes, ultimately increasing trustworthiness. Experimental results demonstrate the effectiveness of our approach across different dialogue datasets, covering both human-human and human-machine exchanges, applicable in task-oriented contexts but also to social media, highlighting its potential impact across various customer-support settings.
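The state-transition part of the approach described in this abstract can be sketched as follows. The sketch assumes the clustering step has already run, so each dialogue is given as a sequence of state labels; the state names, the `<start>` and `<end>` markers, and the function name are hypothetical, and the sentiment annotation of transitions is omitted.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def transition_probabilities(dialogues: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next state | current state) from per-dialogue state sequences."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for states in dialogues:
        # Artificial boundary states make dialogue openings and closings visible.
        path = ["<start>"] + states + ["<end>"]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


# Each inner list is one dialogue after its utterances were mapped to cluster labels.
dialogues = [
    ["greeting", "issue", "solution", "thanks"],
    ["greeting", "issue", "escalation"],
    ["greeting", "issue", "solution"],
]

flow = transition_probabilities(dialogues)
for src, targets in flow.items():
    for dst, prob in sorted(targets.items()):
        print(f"{src} -> {dst}: {prob:.2f}")
```

The resulting nested dictionary is a first-order Markov view of the dialogues, which is one plausible way to represent a probabilistic flow. Attaching an average sentiment score to each edge would be a natural extension in the spirit of the abstract.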

Question Answering for Dialogue State Tracking in Portuguese
Francisco Pais | Patricia Ferreira | Catarina Silva | Ana Alves | Hugo Gonçalo Oliveira
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1

Towards Automated Evaluation of Knowledge Encoded in Large Language Models
Bruno Carlos Luís Ferreira | Catarina Silva | Hugo Gonçalo Oliveira
Proceedings of the Workshop on Deep Learning and Linked Data (DLnLD) @ LREC-COLING 2024

Large Language Models (LLMs) have a significant user base and are gaining increasing interest and impact across various domains. Given their expanding influence, it is crucial to implement appropriate guardrails or controls to ensure ethical and responsible use. In this paper, we propose to automate the evaluation of the knowledge stored in LLMs. This is achieved by generating datasets tailored for this specific purpose, in any selected domain. Our approach consists of four major steps: (i) extraction of relevant entities; (ii) gathering of domain properties; (iii) dataset generation; and (iv) model evaluation. In order to materialize this vision, we experimented with tools and resources for entity linking, knowledge acquisition, classification and prompt generation, yielding valuable insights and lessons. The generation of datasets for domain-specific model evaluation showed that the approach can be a future tool for evaluating LLMs and moving them from “black-boxes” to human-interpretable knowledge bases.

2022

A Brief Survey of Textual Dialogue Corpora
Hugo Gonçalo Oliveira | Patrícia Ferreira | Daniel Martins | Catarina Silva | Ana Alves
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Several dialogue corpora are currently available for research purposes, but they still fall short of meeting the growing interest in developing dialogue systems, each with its own specific requirements. To help those who need such a corpus, this paper surveys a range of available options in terms of aspects like speakers, size, languages, collection, annotations, and domains. Some trends are identified and possible approaches for the creation of new corpora are also discussed.
