Wesley Pompeu Carvalho
2025
Classifying Emotions in Tweets from the Financial Market: A BERT-based Approach
Wesley Pompeu Carvalho | Norton Trevisan Roman
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Behavioural finance emphasizes the relevance of investor sentiment and emotions in the pricing of financial assets. However, little research has examined how discrete emotions can be detected in text related to this domain, with extant work focusing mostly on sentiment instead. This study addresses this problem by describing a framework for emotion classification in tweets related to the stock market, written in Brazilian Portuguese. Emotion classifiers were developed, based on Plutchik’s psychoevolutionary theory, by fine-tuning BERTimbau, a pre-trained BERT-based language model for Brazilian Portuguese, on an existing corpus of stock market tweets previously annotated with emotions. Each of Plutchik’s four emotional axes was modelled as a ternary classification problem. For each axis, 30 independent training iterations were executed using a repeated holdout strategy, with a different train/test split in each iteration. In every iteration, hyperparameter tuning was performed via 10-fold stratified cross-validation on the training set to identify the best configuration. A final model was then retrained using the selected hyperparameters and evaluated on the held-out test set, generating a distribution of macro-F1 scores on out-of-sample data. The results demonstrated statistically significant improvements over a stratified random baseline (Welch’s t-test, p << 0.001 across all axes), with macro-F1 scores ranging from 0.50 to 0.61. These findings point to the feasibility of using transformer-based models to capture emotional nuance in financial texts written in Portuguese and provide a reproducible framework for future research.
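The evaluation protocol described in the abstract (repeated holdout, stratified cross-validation for tuning on the training set only, retraining, macro-F1 on the test split, and Welch's t-test against a stratified random baseline) might be sketched roughly as follows. This is an illustrative sketch and not the authors' code: a scikit-learn `LogisticRegression` on synthetic three-class data stands in for the fine-tuned BERTimbau model and the tweet corpus, and the function name `repeated_holdout`, the `C` grid, and the 80/20 split ratio are assumptions that the abstract does not specify.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split


def repeated_holdout(X, y, n_iterations=30, test_size=0.2, n_folds=10):
    """Return per-iteration macro-F1 arrays for the tuned model and the baseline."""
    model_scores, baseline_scores = [], []
    for seed in range(n_iterations):
        # A different stratified train/test split in each iteration.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed
        )
        # Hyperparameter search with stratified k-fold CV on the training set only.
        cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
        search = GridSearchCV(
            LogisticRegression(max_iter=1000),  # stand-in for the fine-tuned model
            param_grid={"C": [0.1, 1.0]},       # hypothetical grid
            scoring="f1_macro",
            cv=cv,
            refit=True,  # retrain on the full training set with the best setting
        )
        search.fit(X_tr, y_tr)
        model_scores.append(f1_score(y_te, search.predict(X_te), average="macro"))

        # Stratified random baseline: samples predictions from the class prior.
        dummy = DummyClassifier(strategy="stratified", random_state=seed)
        dummy.fit(X_tr, y_tr)
        baseline_scores.append(f1_score(y_te, dummy.predict(X_te), average="macro"))
    return np.array(model_scores), np.array(baseline_scores)


# Synthetic three-class data standing in for one emotional axis (assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
model_f1, baseline_f1 = repeated_holdout(X, y, n_iterations=30)

# Welch's t-test: two-sample t-test that does not assume equal variances.
t_stat, p_value = ttest_ind(model_f1, baseline_f1, equal_var=False)
print(f"model macro-F1 = {model_f1.mean():.2f}, "
      f"baseline macro-F1 = {baseline_f1.mean():.2f}, p = {p_value:.2g}")
```

Keeping the cross-validation inside each training split means the test split never influences hyperparameter selection, so the 30 test scores form an out-of-sample distribution that can be compared with the baseline's.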