@inproceedings{pamies-etal-2026-vinclat,
title = "Vinclat: Evaluating Reasoning, Cognition and Culture in One Game",
author = "P{\`a}mies, Marc and
Aula-Blasco, Javier and
Gonzalez-Agirre, Aitor and
Villegas, Marta",
editor = "Chen, Pinzhen and
Zouhar, Vil{\'e}m and
Hu, Hanxu and
Khanuja, Simran and
Zhu, Wenhao and
Haddow, Barry and
Birch, Alexandra and
Aji, Alham Fikri and
Sennrich, Rico and
Hooker, Sara",
booktitle = "Proceedings of the First Workshop on Multilingual Multicultural Evaluation",
month = mar,
year = "2026",
address = "Rabat, Morocco",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.mme-main.4/",
pages = "49--66",
    isbn = "979-8-89176-368-5",
abstract = "This paper introduces Vinclat, a novel evaluation dataset for Catalan carefully designed to assess the reasoning capabilities and cultural knowledge of LLMs. It comprises 1,000 high-quality instances, meticulously crafted and reviewed by human annotators. Each instance presents a complex riddle that requires a two-step reasoning process involving inferential and abductive reasoning, along with other cognitive skills such as lexical retrieval, paraphrasing, flexibility in interpretation, pattern recognition, and associative thinking. Given four independent clues, models should infer intermediate concepts which, despite being seemingly unrelated, can be creatively connected to reach a final solution. The task targets a unique blend of capabilities, distinguishing it from existing NLP benchmarks. Our evaluation of state-of-the-art models reveals that these still fall significantly short of human-level reasoning, although scaling trends suggest that the performance gap may narrow over time. This indicates that Vinclat provides a robust and long-term challenge, resisting the rapid saturation that is commonly observed in many existing evaluation datasets."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="pamies-etal-2026-vinclat">
<titleInfo>
<title>Vinclat: Evaluating Reasoning, Cognition and Culture in One Game</title>
</titleInfo>
<name type="personal">
<namePart type="given">Marc</namePart>
<namePart type="family">Pàmies</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Javier</namePart>
<namePart type="family">Aula-Blasco</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Aitor</namePart>
<namePart type="family">Gonzalez-Agirre</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marta</namePart>
<namePart type="family">Villegas</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2026-03</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the First Workshop on Multilingual Multicultural Evaluation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Pinzhen</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Vilém</namePart>
<namePart type="family">Zouhar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hanxu</namePart>
<namePart type="family">Hu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Simran</namePart>
<namePart type="family">Khanuja</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wenhao</namePart>
<namePart type="family">Zhu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Barry</namePart>
<namePart type="family">Haddow</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alexandra</namePart>
<namePart type="family">Birch</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alham</namePart>
<namePart type="given">Fikri</namePart>
<namePart type="family">Aji</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rico</namePart>
<namePart type="family">Sennrich</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sara</namePart>
<namePart type="family">Hooker</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Rabat, Morocco</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-368-5</identifier>
</relatedItem>
<abstract>This paper introduces Vinclat, a novel evaluation dataset for Catalan carefully designed to assess the reasoning capabilities and cultural knowledge of LLMs. It comprises 1,000 high-quality instances, meticulously crafted and reviewed by human annotators. Each instance presents a complex riddle that requires a two-step reasoning process involving inferential and abductive reasoning, along with other cognitive skills such as lexical retrieval, paraphrasing, flexibility in interpretation, pattern recognition, and associative thinking. Given four independent clues, models should infer intermediate concepts which, despite being seemingly unrelated, can be creatively connected to reach a final solution. The task targets a unique blend of capabilities, distinguishing it from existing NLP benchmarks. Our evaluation of state-of-the-art models reveals that these still fall significantly short of human-level reasoning, although scaling trends suggest that the performance gap may narrow over time. This indicates that Vinclat provides a robust and long-term challenge, resisting the rapid saturation that is commonly observed in many existing evaluation datasets.</abstract>
<identifier type="citekey">pamies-etal-2026-vinclat</identifier>
<location>
<url>https://aclanthology.org/2026.mme-main.4/</url>
</location>
<part>
<date>2026-03</date>
<extent unit="page">
<start>49</start>
<end>66</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Vinclat: Evaluating Reasoning, Cognition and Culture in One Game
%A Pàmies, Marc
%A Aula-Blasco, Javier
%A Gonzalez-Agirre, Aitor
%A Villegas, Marta
%Y Chen, Pinzhen
%Y Zouhar, Vilém
%Y Hu, Hanxu
%Y Khanuja, Simran
%Y Zhu, Wenhao
%Y Haddow, Barry
%Y Birch, Alexandra
%Y Aji, Alham Fikri
%Y Sennrich, Rico
%Y Hooker, Sara
%S Proceedings of the First Workshop on Multilingual Multicultural Evaluation
%D 2026
%8 March
%I Association for Computational Linguistics
%C Rabat, Morocco
%@ 979-8-89176-368-5
%F pamies-etal-2026-vinclat
%X This paper introduces Vinclat, a novel evaluation dataset for Catalan carefully designed to assess the reasoning capabilities and cultural knowledge of LLMs. It comprises 1,000 high-quality instances, meticulously crafted and reviewed by human annotators. Each instance presents a complex riddle that requires a two-step reasoning process involving inferential and abductive reasoning, along with other cognitive skills such as lexical retrieval, paraphrasing, flexibility in interpretation, pattern recognition, and associative thinking. Given four independent clues, models should infer intermediate concepts which, despite being seemingly unrelated, can be creatively connected to reach a final solution. The task targets a unique blend of capabilities, distinguishing it from existing NLP benchmarks. Our evaluation of state-of-the-art models reveals that these still fall significantly short of human-level reasoning, although scaling trends suggest that the performance gap may narrow over time. This indicates that Vinclat provides a robust and long-term challenge, resisting the rapid saturation that is commonly observed in many existing evaluation datasets.
%U https://aclanthology.org/2026.mme-main.4/
%P 49-66
Markdown (Informal)
[Vinclat: Evaluating Reasoning, Cognition and Culture in One Game](https://aclanthology.org/2026.mme-main.4/) (Pàmies et al., MME 2026)
ACL