Structure Learning for Neural Module Networks

Vardaan Pahuja, Jie Fu, Sarath Chandar, Christopher Pal


Abstract
Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that involve human-specified neural modules, each designed for a specific form of reasoning. In current formulations of such networks, only the parameters of the neural modules and/or the order of their execution are learned. In this work, we extend this approach and also learn the underlying internal structure of the modules in terms of the ordering and combination of simple, elementary arithmetic operators. We use a minimal amount of prior knowledge from the human-specified neural modules, in the form of the different input types and arithmetic operators used in these modules. Our results show that it is indeed possible to learn both internal module structure and module sequencing simultaneously, without extra supervisory signals for module execution sequencing. With this approach, we report performance comparable to models using hand-designed modules. In addition, we analyze the sensitivity of the learned modules with respect to the arithmetic operations and infer the analytical expressions of the learned modules.
Anthology ID:
D19-6401
Volume:
Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
1–10
URL:
https://aclanthology.org/D19-6401
DOI:
10.18653/v1/D19-6401
Cite (ACL):
Vardaan Pahuja, Jie Fu, Sarath Chandar, and Christopher Pal. 2019. Structure Learning for Neural Module Networks. In Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN), pages 1–10, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Structure Learning for Neural Module Networks (Pahuja et al., 2019)
PDF:
https://aclanthology.org/D19-6401.pdf
Attachment:
 D19-6401.Attachment.zip
Data
CLEVR, CLEVR-Humans, Visual Question Answering, Visual Question Answering v2.0