Uriel Anderson Lasheras
2025
CaLQuest.PT: Towards the Collection and Evaluation of Natural Causal Ladder Questions in Portuguese for AI Agents
Uriel Anderson Lasheras | Vladia Pinheiro
Proceedings of the First Workshop on Language Models for Low-Resource Languages
Large Language Models (LLMs) are increasingly central to the development of generative AI across diverse fields. While some anticipate these models may mark a step toward artificial general intelligence, their ability to handle complex causal reasoning remains unproven. Causal reasoning, particularly at Pearl’s interventional and counterfactual levels, is essential for true general intelligence. In this work, we introduce CaLQuest.PT, a dataset of over 8,000 natural causal questions in Portuguese, collected from real human interactions. Built upon a novel three-axis taxonomy, CaLQuest.PT categorizes questions by causal intent, action requirements, and the level of causal reasoning needed (associational, interventional, or counterfactual). Our findings from evaluating CaLQuest.PT’s seed questions with GPT-4o reveal that this LLM faces challenges in handling interventional and relation-seeking causal queries. These results suggest limitations in using GPT-4o for extending causal question annotations and highlight the need for improved LLM strategies in causal reasoning. CaLQuest.PT provides a foundation for advancing LLM capabilities in causal understanding, particularly for the Portuguese-speaking world.