BART-TL: Weakly-Supervised Topic Label Generation

Cristian Popa, Traian Rebedea


Abstract
We propose a novel solution for assigning labels to topic models by using multiple weak labelers. The method leverages generative transformers to learn accurate representations of the most important topic terms and candidate labels. This is achieved by fine-tuning pre-trained BART models on a large number of potential labels generated by state-of-the-art non-neural models for topic labeling, enriched with different techniques. The proposed BART-TL model is able to generate valuable and novel labels in a weakly-supervised manner and can be improved by adding other weak labelers or distant supervision on similar tasks.
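The weak-supervision setup described in the abstract can be pictured as building (topic terms, candidate label) pairs from several weak labelers, which would then serve as source/target text for fine-tuning a sequence-to-sequence model. The sketch below is only an illustration under stated assumptions: the two labelers, the `build_pairs` helper, and the space-joined input format are hypothetical and are not taken from the paper, and the fine-tuning step itself is omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a weak labeler maps a topic's ranked terms to candidate labels.
WeakLabeler = Callable[[List[str]], List[str]]


def top_term_labeler(terms: List[str]) -> List[str]:
    """Illustrative labeler (assumption): propose the highest-ranked term."""
    return terms[:1]


def bigram_labeler(terms: List[str]) -> List[str]:
    """Illustrative labeler (assumption): join the two highest-ranked terms."""
    return [" ".join(terms[:2])] if len(terms) >= 2 else []


def build_pairs(
    topics: List[List[str]],
    labelers: List[WeakLabeler],
    separator: str = " ",
) -> List[Tuple[str, str]]:
    """Collect (source, target) text pairs for seq2seq fine-tuning.

    The source is the topic's terms joined into one string; each distinct
    candidate label from any weak labeler becomes a separate target.
    """
    pairs: List[Tuple[str, str]] = []
    for terms in topics:
        source = separator.join(terms)
        seen = set()  # avoid duplicate labels for the same topic
        for labeler in labelers:
            for label in labeler(terms):
                if label and label not in seen:
                    seen.add(label)
                    pairs.append((source, label))
    return pairs


# Toy topics (made up for illustration only).
topics = [
    ["election", "vote", "party", "candidate"],
    ["genome", "protein", "cell"],
]
pairs = build_pairs(topics, [top_term_labeler, bigram_labeler])
for source, target in pairs:
    print(f"{source!r} -> {target!r}")
```

In a setup like this, adding another weak labeler only means appending one more function to the list, which mirrors the abstract's remark that the model can be improved by adding other weak labelers.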
Anthology ID:
2021.eacl-main.121
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1418–1425
URL:
https://aclanthology.org/2021.eacl-main.121
DOI:
10.18653/v1/2021.eacl-main.121
Cite (ACL):
Cristian Popa and Traian Rebedea. 2021. BART-TL: Weakly-Supervised Topic Label Generation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1418–1425, Online. Association for Computational Linguistics.
Cite (Informal):
BART-TL: Weakly-Supervised Topic Label Generation (Popa & Rebedea, EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.121.pdf