Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification

Zihan Wang, Peiyi Wang, Lianzhe Huang, Xin Sun, Houfeng Wang


Abstract
Hierarchical text classification is a challenging subtask of multi-label classification due to its complex label hierarchy. Existing methods encode text and label hierarchy separately and mix their representations for classification, where the hierarchy remains unchanged for all input text. Instead of modeling them separately, in this work we propose Hierarchy-guided Contrastive Learning (HGCLR) to directly embed the hierarchy into a text encoder. During training, HGCLR constructs positive samples for input text under the guidance of the label hierarchy. By pulling together the input text and its positive sample, the text encoder learns to generate hierarchy-aware text representations independently. Therefore, after training, the HGCLR-enhanced text encoder can dispense with the redundant hierarchy. Extensive experiments on three benchmark datasets verify the effectiveness of HGCLR.
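The core idea described above is a contrastive objective that pulls a document's representation toward a hierarchy-guided positive view of the same document while pushing it away from other documents in the batch. The sketch below is a minimal, illustrative InfoNCE-style loss in PyTorch under that assumption; it is not the authors' released implementation (see wzh9969/contrastive-htc for that), and the function and variable names are hypothetical.

    import torch
    import torch.nn.functional as F

    def contrastive_loss(text_repr: torch.Tensor,
                         positive_repr: torch.Tensor,
                         temperature: float = 0.1) -> torch.Tensor:
        """Batch-wise contrastive loss: each text representation is pulled
        toward its hierarchy-guided positive sample and pushed away from
        the other examples in the batch (illustrative sketch only)."""
        z1 = F.normalize(text_repr, dim=-1)       # (batch, dim)
        z2 = F.normalize(positive_repr, dim=-1)   # (batch, dim)
        logits = z1 @ z2.t() / temperature        # (batch, batch) similarities
        # Diagonal entries are the matched (text, positive) pairs.
        targets = torch.arange(z1.size(0), device=z1.device)
        return F.cross_entropy(logits, targets)

    # Toy usage: in HGCLR the positive view would come from the label
    # hierarchy (e.g. the encoder applied to hierarchy-selected tokens of
    # the same document); random tensors stand in for both views here.
    batch, dim = 8, 768
    anchor = torch.randn(batch, dim)
    positive = anchor + 0.05 * torch.randn(batch, dim)
    print(contrastive_loss(anchor, positive).item())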
Anthology ID:
2022.acl-long.491
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7109–7119
URL:
https://aclanthology.org/2022.acl-long.491
DOI:
10.18653/v1/2022.acl-long.491
Cite (ACL):
Zihan Wang, Peiyi Wang, Lianzhe Huang, Xin Sun, and Houfeng Wang. 2022. Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7109–7119, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification (Wang et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.491.pdf
Code
 wzh9969/contrastive-htc
Data
New York Times Annotated Corpus, RCV1, WOS