DOC: Deep Open Classification of Text Documents

Lei Shu, Hu Xu, Bing Liu


Abstract
Traditional supervised learning makes the closed-world assumption that the classes that appear in the test data must have appeared in training. This also applies to text learning or text classification. As learning is increasingly used in dynamic open environments, where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification is an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep-learning-based approach. It outperforms existing state-of-the-art techniques dramatically.
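To make the problem setting concrete, the following is a minimal sketch of a classifier decision rule with a reject option. It illustrates the open classification task only; it is not the approach proposed in the paper, whose details the abstract does not give. The function name `open_classify`, the per-class confidence scores, and the threshold value of 0.5 are all hypothetical.

```python
from typing import Dict, Optional


def open_classify(scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the best-scoring training class, or None to reject the document.

    `scores` maps each class seen in training to a confidence in [0, 1].
    Both the scores and the threshold are assumptions made for illustration.
    """
    if not scores:
        return None
    # Pick the training class with the highest confidence.
    best_label = max(scores, key=scores.get)
    # A closed-world classifier would always return best_label; an open
    # classifier may instead reject the document as belonging to no seen class.
    if scores[best_label] < threshold:
        return None
    return best_label


# A document that clearly matches a training class is assigned to it.
seen = open_classify({"sports": 0.91, "politics": 0.12})
# A document with low confidence for every training class is rejected.
novel = open_classify({"sports": 0.08, "politics": 0.12})
```

Here `seen` is `"sports"` and `novel` is `None`. Under the closed-world assumption the second document would have been forced into `"politics"`, which is the failure that open classification aims to avoid.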
Anthology ID:
D17-1314
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2911–2916
URL:
https://aclanthology.org/D17-1314/
DOI:
10.18653/v1/D17-1314
Cite (ACL):
Lei Shu, Hu Xu, and Bing Liu. 2017. DOC: Deep Open Classification of Text Documents. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2911–2916, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
DOC: Deep Open Classification of Text Documents (Shu et al., EMNLP 2017)
PDF:
https://aclanthology.org/D17-1314.pdf
Video:
https://aclanthology.org/D17-1314.mp4