Title: Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems
Authors: Hung Le, Doyen Sahoo, Nancy Chen, Steven Hoi
Date: 2019-07
Resource type: text
Genre: conference publication
Published in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Publisher: Association for Computational Linguistics
Location: Florence, Italy
Identifier: le-etal-2019-multimodal
DOI: 10.18653/v1/P19-1564
URL: https://aclanthology.org/P19-1564/
Pages: 5612-5623