Closed Boundary Learning for Classification Tasks with the Universum Class

Hanzhang Zhou, Zijian Feng, Kezhi Mao


Abstract
The Universum class, often known as the *other* class or the *miscellaneous* class, is defined as a collection of samples that do not belong to any class of interest. It is a typical class that exists in many classification-based tasks in NLP, such as relation extraction, named entity recognition, and sentiment analysis. The Universum class exhibits properties very different from those of the classes of interest, namely heterogeneity and lack of representativeness in training data; however, existing methods often treat the Universum class equally with the classes of interest, leading to problems such as overfitting, misclassification, and diminished model robustness. In this work, we propose a closed boundary learning method that applies closed decision boundaries to classes of interest and designates the area outside all closed boundaries in the feature space as the space of the Universum class. Specifically, we formulate closed boundaries as arbitrary shapes, propose the inter-class rule-based probability estimation for the Universum class to cater to its unique properties, and propose a boundary learning loss to adjust decision boundaries based on the balance of misclassified samples inside and outside the boundary. In adherence to the natural properties of the Universum class, our method enhances both accuracy and robustness of classification models, demonstrated by improvements on six state-of-the-art works across three different tasks. Our code is available at https://github.com/hzzhou01/Closed-Boundary-Learning.
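
The decision rule described in the abstract (a sample belongs to a class of interest only if it falls inside that class's closed boundary, and to the Universum class otherwise) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each boundary is a simple hypersphere with a fixed center and radius, whereas the paper formulates boundaries as arbitrary shapes and learns them with a dedicated loss. The function name `classify`, the class labels, and the centers and radii are all hypothetical.

```python
import math

UNIVERSUM = "Universum"

def classify(x, centers, radii):
    """Return the class whose closed boundary contains x, else UNIVERSUM.

    Assumption: each boundary is a hypersphere (center, radius). If x lies
    inside several boundaries, the one with the largest margin wins.
    """
    best_label, best_margin = UNIVERSUM, 0.0
    for label, center in centers.items():
        dist = math.dist(x, center)
        margin = radii[label] - dist  # >= 0 means x is inside or on the boundary
        if margin >= 0 and (best_label == UNIVERSUM or margin > best_margin):
            best_label, best_margin = label, margin
    return best_label

# Hypothetical 2-D feature space with two classes of interest.
centers = {"founder_of": (0.0, 0.0), "born_in": (5.0, 5.0)}
radii = {"founder_of": 1.5, "born_in": 2.0}

print(classify((0.5, 0.5), centers, radii))    # inside the first sphere
print(classify((10.0, -3.0), centers, radii))  # outside every sphere
```

Under this simplification the Universum class needs no boundary or prototype of its own, which matches the abstract's point that it is heterogeneous and poorly represented in training data.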
Anthology ID:
2023.findings-emnlp.1038
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
15522–15536
URL:
https://aclanthology.org/2023.findings-emnlp.1038
DOI:
10.18653/v1/2023.findings-emnlp.1038
Cite (ACL):
Hanzhang Zhou, Zijian Feng, and Kezhi Mao. 2023. Closed Boundary Learning for Classification Tasks with the Universum Class. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 15522–15536, Singapore. Association for Computational Linguistics.
Cite (Informal):
Closed Boundary Learning for Classification Tasks with the Universum Class (Zhou et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.1038.pdf