Solving SCAN Tasks with Data Augmentation and Input Embeddings

Michal Auersperger, Pavel Pecina


Abstract
We address the compositionality challenge presented by the SCAN benchmark. Using data augmentation and a modification of the standard seq2seq architecture with attention, we achieve state-of-the-art results on all the relevant tasks from the benchmark, showing that the models can generalize to words used in unseen contexts. We also propose extending the benchmark with a harder task, which the proposed method cannot solve.
Anthology ID:
2021.ranlp-1.11
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
86–91
URL:
https://aclanthology.org/2021.ranlp-1.11
Cite (ACL):
Michal Auersperger and Pavel Pecina. 2021. Solving SCAN Tasks with Data Augmentation and Input Embeddings. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 86–91, Held Online. INCOMA Ltd.
Cite (Informal):
Solving SCAN Tasks with Data Augmentation and Input Embeddings (Auersperger & Pecina, RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-1.11.pdf
Code:
michal-au/scan-around-twice