Textual Inference in Portuguese: Comparing Language Models

Fabiana Avais, Valeria de Paiva, Livy Real


Abstract
Large language models (LLMs) are increasingly used for Natural Language Inference (NLI), yet their ability to perform logic-sensitive semantic reasoning, especially outside English, remains underexplored. This paper presents a preliminary investigation into the feasibility and usefulness of developing FraCaS-BR, a Portuguese adaptation of the FraCaS benchmark for semantic inference. Using a small diagnostic subset of seven FraCaS problems focusing on generalized quantifiers, plurals, and nominal anaphora, we evaluate the behavior of three LLMs (ChatGPT, Maritalk, and Evaristo) on Brazilian Portuguese translations. Each problem is submitted multiple times to assess correctness, variance, and consistency relative to the original FraCaS gold labels. The results reveal systematic differences across models. While ChatGPT shows higher overall correctness and stability, all models exhibit limitations that undermine their reliability on logic-controlled inference tasks. The extent of manual correction required during translation further underscores the necessity of human-in-the-loop evaluation. Taken together, these findings motivate the development of FraCaS-BR as a controlled evaluation resource for assessing semantic reasoning in Portuguese.
Anthology ID:
2026.propor-2.28
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
201–209
URL:
https://aclanthology.org/2026.propor-2.28/
Cite (ACL):
Fabiana Avais, Valeria de Paiva, and Livy Real. 2026. Textual Inference in Portuguese: Comparing Language Models. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2, pages 201–209, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
Textual Inference in Portuguese: Comparing Language Models (Avais et al., PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-2.28.pdf