Blackbird Language Matrices Tasks for Generalization

Paola Merlo, Chunyang Jiang, Giuseppe Samo, Vivi Nastase


Abstract
To develop a system with near-human language capabilities, we need to understand current systems’ generalisation and compositional abilities. We approach this by generating compositional, structured data, inspired by visual intelligence tests whose solution depends on the problem-solver being able to disentangle objects and their absolute and relative properties in a sequence of images. We design an analogous task and develop corresponding datasets that capture specific linguistic phenomena and their properties. Solving each problem instance depends on detecting the relevant linguistic objects and the generative rules of the problem. We propose two datasets modelling two linguistic phenomena: subject-verb agreement in French and verb alternations in English. The datasets can be used to investigate how LLMs encode linguistic objects, such as phrases, and their grammatical and semantic properties, such as number or semantic role, and how such information is combined to correctly solve each problem. Specifically generated error types help investigate the behaviour of the system: which important information it is able to detect, and which structures mislead it.
Anthology ID:
2023.genbench-1.13
Volume:
Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP
Month:
December
Year:
2023
Address:
Singapore
Editors:
Dieuwke Hupkes, Verna Dankers, Khuyagbaatar Batsuren, Koustuv Sinha, Amirhossein Kazemnejad, Christos Christodoulopoulos, Ryan Cotterell, Elia Bruni
Venues:
GenBench | WS
Publisher:
Association for Computational Linguistics
Pages:
163–172
URL:
https://aclanthology.org/2023.genbench-1.13
DOI:
10.18653/v1/2023.genbench-1.13
Cite (ACL):
Paola Merlo, Chunyang Jiang, Giuseppe Samo, and Vivi Nastase. 2023. Blackbird Language Matrices Tasks for Generalization. In Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP, pages 163–172, Singapore. Association for Computational Linguistics.
Cite (Informal):
Blackbird Language Matrices Tasks for Generalization (Merlo et al., GenBench-WS 2023)
PDF:
https://aclanthology.org/2023.genbench-1.13.pdf
Video:
https://aclanthology.org/2023.genbench-1.13.mp4