Revisiting the Compositional Generalization Abilities of Neural Sequence Models

Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal


Abstract
Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionally generalize. In this paper, we focus on one-shot primitive generalization as introduced by the popular SCAN benchmark. We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance, thereby showing that their compositional generalization abilities were previously underestimated. We perform a detailed empirical analysis of this phenomenon. Our results indicate that the generalization performance of models is highly sensitive to the characteristics of the training data, which should be carefully considered when designing such benchmarks in the future.
Anthology ID:
2022.acl-short.46
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
424–434
URL:
https://aclanthology.org/2022.acl-short.46
DOI:
10.18653/v1/2022.acl-short.46
Cite (ACL):
Arkil Patel, Satwik Bhattamishra, Phil Blunsom, and Navin Goyal. 2022. Revisiting the Compositional Generalization Abilities of Neural Sequence Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 424–434, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Revisiting the Compositional Generalization Abilities of Neural Sequence Models (Patel et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-short.46.pdf
Software:
 2022.acl-short.46.software.zip
Video:
 https://aclanthology.org/2022.acl-short.46.mp4
Code
 arkilpatel/compositional-generalization-seq2seq
Data
SCAN