Contextualized Topic Coherence Metrics

Hamed Rahimi, David Mimno, Jacob Hoover, Hubert Naacke, Camelia Constantin, Bernd Amann


Abstract
This article proposes Contextualized Topic Coherence (CTC), a new family of LLM-based topic coherence metrics inspired by standard human topic evaluation methods. CTC metrics simulate human-centered coherence evaluation while maintaining the efficiency of other automated methods. We compare our CTC metrics with five baseline metrics on seven topic models and show that CTC metrics better reflect human judgment, particularly for topics extracted from short text collections, because they avoid assigning high scores to topics that are meaningless to humans.
Anthology ID:
2024.findings-eacl.123
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1760–1773
URL:
https://aclanthology.org/2024.findings-eacl.123
Cite (ACL):
Hamed Rahimi, David Mimno, Jacob Hoover, Hubert Naacke, Camelia Constantin, and Bernd Amann. 2024. Contextualized Topic Coherence Metrics. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1760–1773, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Contextualized Topic Coherence Metrics (Rahimi et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-eacl.123.pdf
Software:
 2024.findings-eacl.123.software.zip