Self-Supervised Multimodal Opinion Summarization

Jinbae Im; Moonki Kim; Hoyeop Lee; Hyunsouk Cho; Sehee Chung

doi:10.18653/v1/2021.acl-long.33

Abstract

Recently, opinion summarization, which is the generation of a summary from multiple reviews, has been conducted in a self-supervised manner by considering a sampled review as a pseudo summary. However, non-text data such as image and metadata related to reviews have been considered less often. To use the abundant information contained in non-text data, we propose a self-supervised multimodal opinion summarization framework called MultimodalSum. Our framework obtains a representation of each modality using a separate encoder for each modality, and the text decoder generates a summary. To resolve the inherent heterogeneity of multimodal data, we propose a multimodal training pipeline. We first pretrain the text encoder–decoder based solely on text modality data. Subsequently, we pretrain the non-text modality encoders by considering the pretrained text decoder as a pivot for the homogeneous representation of multimodal data. Finally, to fuse multimodal representations, we train the entire framework in an end-to-end manner. We demonstrate the superiority of MultimodalSum by conducting experiments on Yelp and Amazon datasets.
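The self-supervised setup mentioned in the abstract, where a sampled review stands in as a pseudo summary, can be illustrated with a minimal sketch. The function name, the uniform sampling, and the example reviews are assumptions made for illustration only; the abstract does not specify how the review is sampled or how image and metadata inputs are attached.

```python
import random
from typing import List, Tuple


def make_pseudo_summary_pair(
    reviews: List[str], rng: random.Random
) -> Tuple[List[str], str]:
    """Hold out one review as the pseudo summary; the rest are the inputs.

    Hypothetical helper: uniform sampling is an assumption, not the
    selection rule used by the paper.
    """
    if len(reviews) < 2:
        raise ValueError("need at least two reviews to form a training pair")
    target_index = rng.randrange(len(reviews))
    pseudo_summary = reviews[target_index]
    # Every review except the held-out one becomes a source review.
    source_reviews = reviews[:target_index] + reviews[target_index + 1:]
    return source_reviews, pseudo_summary


# Made-up reviews for a single business or product.
reviews = [
    "Great pasta and friendly staff.",
    "The pasta was fresh, but the wait was long.",
    "Cozy place, generous portions.",
    "Service was slow on a busy night.",
]
sources, target = make_pseudo_summary_pair(reviews, random.Random(0))
```

A summarizer trained on such pairs learns to generate the held-out review from the remaining ones, so no human-written summaries are needed.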

Anthology ID: 2021.acl-long.33
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month: August
Year: 2021
Address: Online
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 388–403
URL: https://aclanthology.org/2021.acl-long.33/
DOI: 10.18653/v1/2021.acl-long.33
Cite (ACL): Jinbae Im, Moonki Kim, Hoyeop Lee, Hyunsouk Cho, and Sehee Chung. 2021. Self-Supervised Multimodal Opinion Summarization. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 388–403, Online. Association for Computational Linguistics.
Cite (Informal): Self-Supervised Multimodal Opinion Summarization (Im et al., ACL-IJCNLP 2021)
PDF: https://aclanthology.org/2021.acl-long.33.pdf
Video: https://aclanthology.org/2021.acl-long.33.mp4