Yoichi Aoki


2023

Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?
Keito Kudo | Yoichi Aoki | Tatsuki Kuribayashi | Ana Brassard | Masashi Yoshikawa | Keisuke Sakaguchi | Kentaro Inui
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Compositionality is a pivotal property of symbolic reasoning. However, how well recent neural models capture compositionality in symbolic reasoning tasks remains underexplored. This study empirically addresses this question by systematically examining recently published pre-trained seq2seq models with a carefully controlled dataset of multi-hop arithmetic symbolic reasoning. We introduce a skill tree on compositionality in arithmetic symbolic reasoning that defines the hierarchical levels of complexity along with three compositionality dimensions: systematicity, productivity, and substitutivity. Our experiments revealed that among the three types of composition, the models struggled most with systematicity, performing poorly even with relatively simple compositions. That difficulty was not resolved even after training the models with intermediate reasoning steps.

Empirical Investigation of Neural Symbolic Reasoning Strategies
Yoichi Aoki | Keito Kudo | Tatsuki Kuribayashi | Ana Brassard | Masashi Yoshikawa | Keisuke Sakaguchi | Kentaro Inui
Findings of the Association for Computational Linguistics: EACL 2023

Neural reasoning accuracy improves when generating intermediate reasoning steps. However, the source of this improvement remains unclear. Here, we investigate and factorize the benefit of generating intermediate steps for symbolic reasoning. Specifically, we decompose the reasoning strategy w.r.t. step granularity and chaining strategy. With a purely symbolic numerical reasoning dataset (e.g., A=1, B=3, C=A+3, C?), we found that the choice of reasoning strategy significantly affects performance, with the gap widening as the extrapolation length increases. Surprisingly, we also found that certain configurations lead to nearly perfect performance, even in the case of length extrapolation. Our results indicate the importance of further exploring effective strategies for neural reasoning models.
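
To make the example problem format (A=1, B=3, C=A+3, C?) concrete, the following is a minimal sketch of how such a problem could be solved symbolically while recording intermediate steps. It is an illustration only, not the paper's dataset generator or model: the function name `solve_with_steps`, the step format, the restriction to integer addition and subtraction, and the choice to resolve variables backward from the query are all assumptions made for this example.

```python
import re


def solve_with_steps(problem: str):
    """Answer the query in a problem such as "A=1, B=3, C=A+3, C?".

    Hypothetical helper: returns (steps, answer), where steps lists one
    intermediate result per variable that the query actually depends on.
    """
    # Split "A=1, B=3, C=A+3, C?" into equations and the final query.
    *equations, query = [part.strip() for part in problem.split(",")]
    target = query.rstrip("?")
    definitions = dict(eq.split("=", 1) for eq in equations)

    values = {}  # variable name -> resolved integer value
    steps = []   # intermediate steps, in the order they are resolved

    def evaluate(name: str) -> int:
        # Resolve a variable by first resolving the variables it refers to.
        if name not in values:
            total, resolved = 0, []
            for sign, token in re.findall(r"([+-]?)(\w+)", definitions[name]):
                number = int(token) if token.isdigit() else evaluate(token)
                total += -number if sign == "-" else number
                resolved.append(f"{sign}{number}")
            values[name] = total
            if len(resolved) > 1:
                steps.append(f"{name}={''.join(resolved)}={total}")
            else:
                steps.append(f"{name}={total}")
        return values[name]

    answer = evaluate(target)
    return steps, answer


steps, answer = solve_with_steps("A=1, B=3, C=A+3, C?")
print(steps)   # ['A=1', 'C=1+3=4']
print(answer)  # 4
```

In this sketch the equation B=3 never appears in the steps because the query C does not depend on it. Writing out every equation in order, or emitting only the final answer, would be other ways to vary the steps that are generated.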