Neural Multimodal Topic Modeling: A Comprehensive Evaluation

Felipe Gonzalez-Pizarro; Giuseppe Carenini

Abstract

Neural topic models can successfully find coherent and diverse topics in textual data. However, they are limited in dealing with multimodal datasets (e.g., images and text). This paper presents the first systematic and comprehensive evaluation of multimodal topic modeling of documents containing both text and images. In the process, we propose two novel topic modeling solutions and two novel evaluation metrics. Overall, our evaluation on an unprecedentedly rich and diverse collection of datasets indicates that both of our models generate coherent and diverse topics. Nevertheless, the extent to which one method outperforms the other depends on the metrics and dataset combinations, which suggests further exploration of hybrid solutions in the future. Notably, our succinct human evaluation aligns with the outcomes determined by our proposed metrics. This alignment not only reinforces the credibility of our metrics but also highlights the potential for their application in guiding future multimodal topic modeling endeavors.

Anthology ID: 2024.lrec-main.1064
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 12159–12172
URL: https://aclanthology.org/2024.lrec-main.1064
Cite (ACL): Felipe Gonzalez-Pizarro and Giuseppe Carenini. 2024. Neural Multimodal Topic Modeling: A Comprehensive Evaluation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12159–12172, Torino, Italia. ELRA and ICCL.
Cite (Informal): Neural Multimodal Topic Modeling: A Comprehensive Evaluation (Gonzalez-Pizarro & Carenini, LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.1064.pdf