Amélie Reymond
2023
mSCAN: A Dataset for Multilingual Compositional Generalisation Evaluation
Amélie Reymond | Shane Steinert-Threlkeld
Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP
Language models achieve remarkable results on a variety of tasks, yet still struggle on compositional generalisation benchmarks. The majority of these benchmarks evaluate performance in English only, leaving open the question of whether these results generalise to other languages. As an initial step towards answering this question, we introduce mSCAN, a multilingual adaptation of the SCAN dataset. It was produced by rule-based translation, developed in cooperation with native speakers. We then showcase this novel dataset in some in-context learning experiments with the multilingual large language model BLOOM as well as gpt3.5-turbo.