Exploring Unified Training Framework for Multimodal User Profiling

Minjie Qiang, Zhongqing Wang, Shoushan Li, Guodong Zhou


Abstract
With the emergence of social media and e-commerce platforms, accurate user profiling has become increasingly vital for recommendation systems and personalized services. Recent studies have focused on generating detailed user profiles by extracting various aspects of user attributes from textual reviews. Nevertheless, these investigations have not fully exploited the potential of the abundant multimodal data at hand. In this study, we propose a novel task called multimodal user profiling. This task emphasizes the utilization of both review texts and their accompanying images to create comprehensive user profiles. By integrating textual and visual data, we leverage their complementary strengths, enabling the generation of more holistic user representations. Additionally, we explore a unified joint training framework with various multimodal training strategies that incorporate users’ historical review texts and images for user profile generation. Our experimental results underscore the significance of multimodal data in enhancing user profile generation and demonstrate the effectiveness of the proposed unified joint training approach.
Anthology ID:
2025.coling-main.115
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
1699–1710
URL:
https://aclanthology.org/2025.coling-main.115/
Cite (ACL):
Minjie Qiang, Zhongqing Wang, Shoushan Li, and Guodong Zhou. 2025. Exploring Unified Training Framework for Multimodal User Profiling. In Proceedings of the 31st International Conference on Computational Linguistics, pages 1699–1710, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Exploring Unified Training Framework for Multimodal User Profiling (Qiang et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.115.pdf