Javier Gonzalez
2026
Reasoning Beyond Labels: Measuring LLM Sentiment in Low-Resource, Culturally Nuanced Contexts
Millicent Ochieng | Anja Thieme | Ignatius Ezeani | Risa Ueno | Samuel Chege Maina | Keshet Ronen | Javier Gonzalez | Jacki O'Neill
Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
Sentiment analysis in low-resource, culturally nuanced contexts challenges conventional NLP approaches that assume fixed labels and universal affective expressions. We present a diagnostic framework that treats sentiment as a context-dependent, culturally embedded construct, and evaluate how large language models (LLMs) reason about sentiment in informal, code-mixed WhatsApp messages from Nairobi youth health groups. Using human-annotated data, sentiment-flipped counterfactuals, and rubric-based explanation evaluation, we probe LLM interpretability, robustness, and alignment with human reasoning. Framing our evaluation through a social science measurement lens, we operationalize LLM outputs as an instrument for measuring the abstract concept of sentiment. Our findings reveal significant variation in model reasoning quality, with top-tier LLMs demonstrating greater interpretive stability, while smaller open-weight models in our study show reduced stability under ambiguity or sentiment shifts. This work highlights the need for culturally sensitive, reasoning-aware AI evaluation in complex, real-world communication.
2016
Some strategies for the improvement of a Spanish WordNet
Matias Herrera | Javier Gonzalez | Luis Chiruzzo | Dina Wonsever
Proceedings of the 8th Global WordNet Conference (GWC)
Although versions of Princeton WordNet now exist for several languages, some of them are too underdeveloped to be used in Natural Language Processing applications. Such is the case of the Spanish WordNet contained in the Multilingual Central Repository (MCR), which we tried unsuccessfully to incorporate into an anaphora resolution application and into search term expansion. We therefore proposed and tested different strategies to improve the coverage of the MCR Spanish WordNet, obtaining encouraging results. A specific process was conducted to increase the number of adverbs, and a few simple processes made it possible to increase, at very low cost, the number of terms in the Spanish WordNet. Finally, a more complex method based on distributional semantics was proposed, using the relations between English WordNet synsets, which also returned positive results.