Learning the Dyck Language with Attention-based Seq2Seq Models

Xiang Yu, Ngoc Thang Vu, Jonas Kuhn


Abstract
The generalized Dyck language has been used to analyze the ability of Recurrent Neural Networks (RNNs) to learn context-free grammars (CFGs). Recent studies draw conflicting conclusions about their performance, especially regarding how well the models generalize with respect to the depth of recursion. In this paper, we revisit several common models and experimental settings, and discuss potential problems with the tasks and analyses. Furthermore, we explore the use of attention mechanisms within the seq2seq framework to learn the Dyck language, which could compensate for the limited encoding ability of RNNs. Our findings reveal that attention mechanisms still cannot truly generalize over the recursion depth, although they perform much better than other models on the closing bracket tagging task. This also suggests that this commonly used task is not sufficient to test a model’s understanding of CFGs.
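For concreteness, the sketch below (not taken from the paper) illustrates the two ingredients the abstract refers to: sampling well-nested strings from a generalized Dyck language and deriving the closing bracket tagging task, in which a model must predict which closing bracket is licensed by a given prefix. The Dyck-2 alphabet, the recursive sampler with opening probability p_open, and all function names are illustrative assumptions, not the authors' actual data-generation procedure.

```python
import random

# Bracket pairs for a generalized Dyck-k language (here k = 2, an assumption).
PAIRS = {"(": ")", "[": "]"}
OPENINGS = list(PAIRS)

def generate_dyck(max_depth, p_open=0.5, depth=0):
    """Recursively sample a well-nested Dyck-k string up to max_depth."""
    symbols = []
    while random.random() < p_open and depth < max_depth:
        opening = random.choice(OPENINGS)
        symbols.append(opening)
        symbols.extend(generate_dyck(max_depth, p_open, depth + 1))
        symbols.append(PAIRS[opening])
    return symbols

def closing_bracket_examples(sequence):
    """For each closing position, pair the preceding prefix with the closing
    bracket a model should predict at that position."""
    stack, examples = [], []
    for i, symbol in enumerate(sequence):
        if symbol in PAIRS:
            stack.append(symbol)
        else:
            expected = PAIRS[stack.pop()]  # matching close for the top of the stack
            examples.append(("".join(sequence[:i]), expected))
    return examples

if __name__ == "__main__":
    random.seed(0)
    seq = generate_dyck(max_depth=4)
    for prefix, target in closing_bracket_examples(seq):
        print(f"{prefix!r:30} -> {target}")
```

Because the target at each closing position is fully determined by the nesting stack, a model can solve this tagging task by tracking only the sequence of currently open brackets, which is why, as the abstract notes, success on it does not by itself demonstrate an understanding of the underlying CFG.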
Anthology ID:
W19-4815
Volume:
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Tal Linzen, Grzegorz Chrupała, Yonatan Belinkov, Dieuwke Hupkes
Venue:
BlackboxNLP
Publisher:
Association for Computational Linguistics
Pages:
138–146
URL:
https://aclanthology.org/W19-4815
DOI:
10.18653/v1/W19-4815
Cite (ACL):
Xiang Yu, Ngoc Thang Vu, and Jonas Kuhn. 2019. Learning the Dyck Language with Attention-based Seq2Seq Models. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 138–146, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Learning the Dyck Language with Attention-based Seq2Seq Models (Yu et al., BlackboxNLP 2019)
PDF:
https://aclanthology.org/W19-4815.pdf