mSCAN: A Dataset for Multilingual Compositional Generalisation Evaluation

Amélie Reymond, Shane Steinert-Threlkeld


Abstract
Language models achieve remarkable results on a variety of tasks, yet still struggle on compositional generalisation benchmarks. The majority of these benchmarks evaluate performance in English only, leaving open the question of whether these results generalise to other languages. As an initial step towards answering this question, we introduce mSCAN, a multilingual adaptation of the SCAN dataset, produced by rule-based translation developed in cooperation with native speakers. We then showcase this novel dataset in in-context learning experiments with GPT-3.5 and the multilingual large language model BLOOM.
Anthology ID:
2023.genbench-1.11
Volume:
Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP
Month:
December
Year:
2023
Address:
Singapore
Editors:
Dieuwke Hupkes, Verna Dankers, Khuyagbaatar Batsuren, Koustuv Sinha, Amirhossein Kazemnejad, Christos Christodoulopoulos, Ryan Cotterell, Elia Bruni
Venues:
GenBench | WS
Publisher:
Association for Computational Linguistics
Pages:
143–151
URL:
https://aclanthology.org/2023.genbench-1.11
DOI:
10.18653/v1/2023.genbench-1.11
Cite (ACL):
Amélie Reymond and Shane Steinert-Threlkeld. 2023. mSCAN: A Dataset for Multilingual Compositional Generalisation Evaluation. In Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP, pages 143–151, Singapore. Association for Computational Linguistics.
Cite (Informal):
mSCAN: A Dataset for Multilingual Compositional Generalisation Evaluation (Reymond & Steinert-Threlkeld, GenBench-WS 2023)
PDF:
https://aclanthology.org/2023.genbench-1.11.pdf
Video:
https://aclanthology.org/2023.genbench-1.11.mp4