FoTo: Targeted Visual Topic Modeling for Focused Analysis of Short Texts

Sanuj Kumar; Tuan Le

Abstract

Given a corpus of documents, focused analysis aims to find topics relevant to the aspects a user is interested in. These aspects are often expressed as a set of user-provided keywords. Short texts such as microblogs and tweets pose several challenges to this task because the sparsity of word co-occurrences may hinder the extraction of meaningful and relevant topics. Moreover, most existing topic models perform a full-corpus analysis that treats all topics equally, so the learned topics may not be on target. In this paper, we propose a novel targeted topic model for semantic short-text embedding that learns all topics and low-dimensional visual representations of documents while preserving the relevant topics for focused analysis of short texts. To preserve the relevant topics in the visualization space, we propose jointly modeling topics and a pairwise document ranking based on document-keyword distances in the visualization space. Extensive experiments on several real-world datasets demonstrate the effectiveness of the proposed model for both targeted topic modeling and visualization.

Anthology ID: 2024.lrec-main.653
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 7406–7416
URL: https://aclanthology.org/2024.lrec-main.653/
Cite (ACL): Sanuj Kumar and Tuan Le. 2024. FoTo: Targeted Visual Topic Modeling for Focused Analysis of Short Texts. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7406–7416, Torino, Italia. ELRA and ICCL.
Cite (Informal): FoTo: Targeted Visual Topic Modeling for Focused Analysis of Short Texts (Kumar & Le, LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.653.pdf
