Wino-X: Multilingual Winograd Schemas for Commonsense Reasoning and Coreference Resolution

Denis Emelin; Rico Sennrich

doi:10.18653/v1/2021.emnlp-main.670

Abstract

Winograd schemas are a well-established tool for evaluating coreference resolution (CoR) and commonsense reasoning (CSR) capabilities of computational models. So far, schemas have remained largely confined to English, limiting their utility in multilingual settings. This work presents Wino-X, a parallel dataset of German, French, and Russian schemas, aligned with their English counterparts. We use this resource to investigate whether neural machine translation (NMT) models can perform CoR that requires commonsense knowledge and whether multilingual language models (MLLMs) are capable of CSR across multiple languages. Our findings show Wino-X to be exceptionally challenging for NMT systems that are prone to undesirable biases and unable to detect disambiguating information. We quantify biases using established statistical methods and define ways to address both of these issues. We furthermore present evidence of active cross-lingual knowledge transfer in MLLMs, whereby fine-tuning models on English schemas yields CSR improvements in other languages.

Anthology ID: 2021.emnlp-main.670
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8517–8532
URL: https://aclanthology.org/2021.emnlp-main.670/
DOI: 10.18653/v1/2021.emnlp-main.670
Cite (ACL): Denis Emelin and Rico Sennrich. 2021. Wino-X: Multilingual Winograd Schemas for Commonsense Reasoning and Coreference Resolution. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8517–8532, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Wino-X: Multilingual Winograd Schemas for Commonsense Reasoning and Coreference Resolution (Emelin & Sennrich, EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.670.pdf
Software: 2021.emnlp-main.670.Software.zip
Video: https://aclanthology.org/2021.emnlp-main.670.mp4
