Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings

Hwiyeol Jo; Ceyda Cinarel

doi:10.18653/v1/D19-1347

Abstract

We propose a novel and simple method for semi-supervised text classification. The method stems from the hypothesis that a classifier with pretrained word embeddings always outperforms the same classifier with randomly initialized word embeddings, as empirically observed in NLP tasks. Our method first builds two sets of classifiers as a form of model ensemble, and then initializes their word embeddings differently: one set with random word embeddings, the other with pretrained word embeddings. We focus on the unlabeled examples on which the two classifiers' predictions differ, while following the self-training framework. We also use early stopping in meta-epochs to improve the performance of our method. Our method, Delta-training, outperforms the self-training and co-training frameworks on four text classification datasets, showing robustness against error accumulation.
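
The selection step described in the abstract can be illustrated with a short sketch. The abstract does not state the exact rule, so this is one plausible reading under stated assumptions: unlabeled examples on which the two classifiers disagree are picked, and they are pseudo-labeled with the prediction of the classifier that uses pretrained embeddings. The names `select_delta`, `toy_random`, and `toy_pretrained` are hypothetical, and the toy classifiers are stand-ins for trained models, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a classifier maps a text to an integer class id.
Classifier = Callable[[str], int]


def select_delta(
    unlabeled: Sequence[str],
    clf_random: Classifier,
    clf_pretrained: Classifier,
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split unlabeled texts by whether the two classifiers disagree.

    Assumption (not confirmed by the abstract): disagreeing examples are
    pseudo-labeled with the pretrained-embedding classifier's prediction.
    Returns (pseudo-labeled disagreements, texts left in the unlabeled pool).
    """
    selected: List[Tuple[str, int]] = []
    remaining: List[str] = []
    for text in unlabeled:
        pred_random = clf_random(text)
        pred_pretrained = clf_pretrained(text)
        if pred_random != pred_pretrained:
            selected.append((text, pred_pretrained))
        else:
            remaining.append(text)
    return selected, remaining


# Toy stand-ins for the two trained classifiers (illustrative only).
def toy_random(text: str) -> int:
    return 1 if "good" in text else 0


def toy_pretrained(text: str) -> int:
    return 1 if ("good" in text or "great" in text) else 0


pool = ["good movie", "great plot", "dull pacing"]
selected, remaining = select_delta(pool, toy_random, toy_pretrained)
```

In a self-training loop, the `selected` pairs would be added to the labeled set, both classifiers retrained, and the step repeated on `remaining` for each meta-epoch until an early-stopping criterion is met.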

Anthology ID: D19-1347
Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues: EMNLP | IJCNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 3458–3463
URL: https://aclanthology.org/D19-1347/
DOI: 10.18653/v1/D19-1347
Cite (ACL): Hwiyeol Jo and Ceyda Cinarel. 2019. Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3458–3463, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings (Jo & Cinarel, EMNLP-IJCNLP 2019)
PDF: https://aclanthology.org/D19-1347.pdf
