FIDELITY: Fine-grained Interpretable Distillation for Effective Language Insights and Topic Yielding

Divyansh Singh; Brodie Mather; Demi Zhang; Patrick Lehman; Justin Ho; Bonnie J Dorr

doi:10.18653/v1/2025.findings-naacl.132

Abstract

The rapid expansion of text data has increased the need for effective methods to distill meaningful information from large datasets. Traditional and state-of-the-art approaches have made significant strides in topic modeling, yet they fall short in generating contextually specific and semantically intuitive topics, particularly in dynamic environments and low-resource languages. Additionally, multi-document summarization systems often struggle with issues like redundancy, scalability, and maintaining readability. We introduce FIDELITY (Fine-grained Interpretable Distillation for Effective Language Insights and Topic Yielding), a hybrid method that combines topic modeling and text summarization to produce fine-grained, semantically rich, and contextually relevant output. FIDELITY enhances dataset accessibility and interpretability, outperforming traditional models in topic diversity, similarity, and in the ability to process new, unseen documents. Additionally, it demonstrates robust multilingual capabilities, effectively handling low-resource languages like Tagalog. This makes FIDELITY a powerful tool for distilling and understanding complex textual data, providing detailed insights while maintaining the necessary granularity for practical applications.

Anthology ID: 2025.findings-naacl.132
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2460–2472
URL: https://aclanthology.org/2025.findings-naacl.132/
DOI: 10.18653/v1/2025.findings-naacl.132
Cite (ACL): Divyansh Singh, Brodie Mather, Demi Zhang, Patrick Lehman, Justin Ho, and Bonnie J Dorr. 2025. FIDELITY: Fine-grained Interpretable Distillation for Effective Language Insights and Topic Yielding. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 2460–2472, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): FIDELITY: Fine-grained Interpretable Distillation for Effective Language Insights and Topic Yielding (Singh et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-naacl.132.pdf
