Tiago de Melo
2026
Discovery of Legal Patterns in Civil Petitions via LLM-Based Fact Extraction and Density Clustering
Rhedson Esashika | Carlos M. S. Figueiredo | Tiago de Melo
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
The analysis of unstructured civil petitions is often hindered by procedural noise and verbose argumentation. To address this, we propose a pipeline that combines LLM-based fact extraction with legal-domain text embeddings for unsupervised density clustering. We employ Large Language Models to isolate factual narratives from raw texts, which are then encoded using domain-specific representations (Legal-BERT) and grouped via UMAP dimensionality reduction and the HDBSCAN algorithm. Comparative experiments on a Brazilian judicial corpus reveal that clustering based solely on extracted facts yields significantly more cohesive and semantically well-defined groups than clustering on the raw texts, which suffer from fragmentation due to content variability. Results indicate that the proposed method is a promising approach for thematic organization, procedural triage support, and large-scale discovery of legal patterns.
Gendered Stylistic Variation in Brazilian Portuguese Google Play Reviews: A Large-Scale Study
Tiago de Melo
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
We study gender-associated stylistic variation in Brazilian Portuguese Google Play reviews. Using IBGE name frequencies, we infer binary gender from first names in 76.7M reviews (96 apps, 2011–2025), obtaining 22.25M high-confidence labels. Women-associated reviews show markedly higher paralinguistic expressivity (about 60% higher emoji density and more lengthening/punctuation), while lexical diversity (MTLD) is nearly identical across groups. Ratings are mostly positive, with men contributing relatively more 1-star reviews and women more 5-star reviews. These findings contribute to a deeper understanding of digital sociolinguistic behavior within the Brazilian context. We discuss limitations of name-based gender inference and future demographic extensions.
Gender Identification in Brazilian Portuguese Product Reviews: A Comparative Study of Classical Models, BERT, and LLMs
Tiago de Melo | Carlos M. S. Figueiredo
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
This study analyzes gender identification in Brazilian Portuguese using Amazon reviews drawn from ten product categories. Nine models were evaluated: three classical classifiers (Logistic Regression, Random Forest, and SVM), a multilingual BERT, and five LLMs (ChatGPT 4o, ChatGPT 3.5, DeepSeek, Sabia3, and Sabiazinho). Experiments show that BERT achieved the best performance (macro-F1 = 0.634), outperforming ChatGPT 4o and Logistic Regression by less than one percentage point. Reviews authored by women reach an average F1 of 0.654—four points higher than those by men. Performance also varies by domain: books and automotive are easier, whereas baby and pets are more challenging.
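For readers unfamiliar with the metric reported above, the following is a minimal sketch of how macro-F1 is computed: F1 is calculated per class and then averaged without weighting by class size. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example with hypothetical labels "F" and "M".
    print(macro_f1(["F", "F", "M", "M"], ["F", "M", "M", "M"]))
```

Because every class contributes equally, a model cannot reach a high macro-F1 by performing well only on the majority class.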
Rating–Text Mismatch in Brazilian Portuguese Reviews: How Reliable Are Zero-Shot LLMs?
Emanuelle Marreira | Carlos M. S. Figueiredo | Tiago de Melo
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
This study evaluates the ability of large language models (LLMs) to detect incoherence between the text of product reviews and their assigned rating (1 or 5 stars). Using popular LLMs such as GPT-5, Llama-4 and DeepSeek-3.2, along with two models optimized for Brazilian Portuguese, Sabiá-3.1 and Bode-3.1, we show that some are capable of detecting incoherence between texts and ratings (F1 > 90%) in a zero-shot protocol. The models also show high agreement in their predictions: repeated prediction rounds led to low variability (Fleiss’ κ > 0.95). With incoherence present in all product categories (approx. 10% of comments), the results suggest that LLMs are very promising for this highly semantic interpretation task and can serve as valuable tools for online monitoring and recommendation systems.
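The agreement statistic mentioned above, Fleiss' κ, can be sketched as follows. This is a generic implementation of the standard formula; treating each prediction round as one "rater" is an assumption about how the statistic would be applied, and the function name `fleiss_kappa` and the toy counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters (here, assumed to be prediction rounds) that assigned
    item i to category j. Every row must sum to the same rater count."""
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("counts must contain at least one item")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated the same number of times")

    # Observed agreement: mean over items of the pairwise agreement rate.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the marginal category proportions.
    n_categories = len(counts[0])
    total = n_items * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Four items, three rounds each, two categories (coherent / incoherent).
    print(fleiss_kappa([[3, 0], [2, 1], [0, 3], [1, 2]]))
```

Values near 1 indicate that repeated rounds almost always agree on the label for each review, while values near 0 indicate agreement no better than chance.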
2024
Is ChatGPT an effective solver of sentiment analysis tasks in Portuguese? A Preliminary Study
Gladson de Araujo | Tiago de Melo | Carlos Maurício S. Figueiredo
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1
2020
BabelEnconding at SemEval-2020 Task 3: Contextual Similarity as a Combination of Multilingualism and Language Models
Lucas Rafael Costella Pessutto | Tiago de Melo | Viviane P. Moreira | Altigran da Silva
Proceedings of the Fourteenth Workshop on Semantic Evaluation
This paper describes the system submitted by our team (BabelEnconding) to SemEval-2020 Task 3: Predicting the Graded Effect of Context in Word Similarity. We propose an approach that relies on translation and multilingual language models to compute the contextual similarity between pairs of words. Our hypothesis is that evidence from additional languages can improve the correlation with the human-generated scores. BabelEnconding was applied to both subtasks, ranked among the top 3 in six out of eight task/language combinations, and was the highest-scoring system three times.