Daniela Barreiro Claro
2026
AttentionApp: An Interactive Tool for Analyzing Transformer Attention Patterns in Portuguese
Ricardo G. Oliveira | Daniela Barreiro Claro
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
This paper presents AttentionApp, an interactive demonstration system designed to support the inspection and linguistic analysis of attention mechanisms in Transformer-based language models for Portuguese. The tool allows users to input sentences in Portuguese and visualize attention distributions across layers and heads, enabling fine-grained qualitative analysis of syntactic and semantic patterns captured by the model. AttentionApp is intended as a research-oriented tool, facilitating exploratory analysis, hypothesis generation, and interpretability studies for Portuguese Natural Language Processing.
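As a rough illustration of the kind of quantity such a tool displays, the sketch below computes scaled dot-product attention weights for a single head. It is not AttentionApp's code: the function name `attention_weights`, the toy tokens, and the toy vectors are all hypothetical, and a real model would use learned query and key projections per layer and head.

```python
import math


def attention_weights(queries, keys):
    """Return one row of attention weights per query (softmax over keys).

    Illustrative sketch only: queries and keys are plain lists of floats,
    standing in for the per-head projections of a real Transformer.
    """
    dim = len(keys[0])
    rows = []
    for q in queries:
        # Scaled dot product between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        # Numerically stable softmax.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


# Hypothetical 2-dimensional vectors for the tokens of "O gato dorme".
tokens = ["O", "gato", "dorme"]
toy_vectors = [[0.1, 0.3], [0.9, 0.2], [0.4, 0.8]]

for token, row in zip(tokens, attention_weights(toy_vectors, toy_vectors)):
    print(token, [round(w, 3) for w in row])
```

Each printed row sums to 1 and corresponds to one row of the heatmap that an attention viewer would draw for a given layer and head.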
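dialect2vec: Um método baseado em vetores para transcrição dialetal do português a partir de questionários do ALiB (dialect2vec: A Vector-Based Method for Dialectal Transcription of Portuguese from ALiB Questionnaires)[In Portuguese]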
Laila Mota | Daniela Barreiro Claro | Eloize R. Marques Seno | Rerisson Cavalcante de Araújo
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Modeling dialectal variation is challenging when it depends on subword-based language models, which often fail to handle the complexity of phonetic transcriptions because of vocabulary restrictions and semantic biases. This work introduces dialect2vec, a method for capturing the dialectal diversity of Brazilian Portuguese. Our proposal adopts the token-free model ByT5 to encode International Phonetic Alphabet (IPA) sequences at the byte level, mitigating the information loss caused by unknown tokens. The experiments were carried out with data from the Atlas Linguístico do Brasil (ALiB). The phonetic dimension alone proved viable in unsupervised clustering tasks, with performance close to the lexical state of the art (BERTimbau), demonstrating that byte-based architectures can recover complex dialectal structures exclusively through phonological cues and offer a more granular mapping of linguistic boundaries.
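To make the byte-level idea concrete, the sketch below encodes an IPA transcription as a sequence of UTF-8 byte ids, so no symbol can fall outside the vocabulary. It is not the dialect2vec pipeline: the function names, the example transcription, and the offset of 3 (standing in for ids reserved for special tokens) are illustrative assumptions.

```python
def ipa_to_byte_ids(transcription: str, offset: int = 3) -> list[int]:
    """Map an IPA string to byte-level ids.

    Each UTF-8 byte becomes one id. `offset` shifts the ids to leave room
    for special tokens; the value 3 is an assumption for illustration.
    """
    return [byte + offset for byte in transcription.encode("utf-8")]


def byte_ids_to_ipa(ids: list[int], offset: int = 3) -> str:
    """Invert ipa_to_byte_ids, recovering the original transcription."""
    return bytes(i - offset for i in ids).decode("utf-8")


# Hypothetical transcription; the stress mark and the final vowel
# each occupy two bytes in UTF-8, the other symbols one byte each.
example = "ˈkazɐ"
ids = ipa_to_byte_ids(example)
print(len(example), "characters ->", len(ids), "byte ids")
print(byte_ids_to_ipa(ids) == example)
```

Because every possible byte has an id, rare IPA diacritics are never mapped to an unknown token, at the cost of longer input sequences.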
Analysis of Machine Translators on Sentences Generated by Portuguese Image Captioning Models
Natan Moura | João Medrado Gondim | Daniela Barreiro Claro | Babacar Mane
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Recent work in computer vision and natural language processing has enabled the recognition and identification of objects in images and the automatic generation of descriptions. Despite these advances, most research in this field targets English and requires adaptation for other languages, such as Portuguese. One adaptation method is the translate-train approach, which involves translating the training dataset into the desired language. However, the available translators differ in effectiveness. The primary objective of this work is to evaluate the behavior of image captioning models when trained on datasets translated into Portuguese by different automatic translators, both quantitatively (cost, training time, metrics on the test set) and qualitatively (comparative evaluation form, error analysis). The results indicate that it is possible to obtain valid automatic descriptions in Portuguese from image captioning models trained on translated datasets, and that more robust translators produce more meaningful descriptions.
2024
Proceedings of the 15th Brazilian Symposium in Information and Human Language Technology
Daniela Barreiro Claro | Adriana Pagano
Proceedings of the 15th Brazilian Symposium in Information and Human Language Technology
Brazilian Consumer Protection Code: a methodology for a dataset to Question-Answer (QA) Models
Aline Athaydes | Lucas Bulcao | Caio Sacramento | Babacar Mane | Daniela Barreiro Claro | Marlo Souza | Robespierre Pita
Proceedings of the 15th Brazilian Symposium in Information and Human Language Technology
2023
TransAlign: tradução e alinhamento de corpora para a língua portuguesa (TransAlign: Translation and Alignment of Corpora for the Portuguese Language)[In Portuguese]
Alan Melo | Daniela Barreiro Claro
Proceedings of the 14th Brazilian Symposium in Information and Human Language Technology
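Desambiguação dos termos do Atlas Linguístico do Brasil através da OpenWordnet-PT-ALiB (Disambiguation of Terms from the Atlas Linguístico do Brasil through OpenWordnet-PT-ALiB)[In Portuguese]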
Augusto Barreto | Daniela Barreiro Claro
Proceedings of the 14th Brazilian Symposium in Information and Human Language Technology
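Desafios da tarefa de Extração de Informação Aberta: uma abordagem metodológica de um corpus automatizado até o corpus manual (Challenges of the Open Information Extraction Task: A Methodological Approach from an Automated Corpus to a Manual Corpus)[In Portuguese]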
Beatriz Queiroz | Rerisson Cavalcante | Daniela Barreiro Claro
Proceedings of the 14th Brazilian Symposium in Information and Human Language Technology
2017
Utilizando Features Linguísticas Genéricas para Classificação de Triplas Relacionais em Português (Generic Linguistic Features for Relational Triples Classification in Portuguese)[In Portuguese]
George Barbosa | Daniela Barreiro Claro
Proceedings of the 11th Brazilian Symposium in Information and Human Language Technology