Building a Video-and-Language Dataset with Human Actions for Multimodal Logical Inference

Authors: Riko Suzuki, Hitomi Yanaka, Koji Mineshima, Daisuke Bekki
Date: 2021-06
Venue: Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)
Editors: Lucia Donatelli, Nikhil Krishnaswamy, Kenneth Lai, James Pustejovsky
Publisher: Association for Computational Linguistics
Location: Groningen, Netherlands (Online)
Type: conference publication
Anthology ID: suzuki-etal-2021-building
URL: https://aclanthology.org/2021.mmsr-1.10/
Pages: 102-107