Hengru Xu
2019
Enhancing Local Feature Extraction with Global Representation for Neural Text Classification
Guocheng Niu | Hengru Xu | Bolei He | Xinyan Xiao | Hua Wu | Sheng Gao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
For text classification, traditional local feature driven models learn long-range dependencies by deep stacking or hybrid modeling. This paper proposes a novel Encoder1-Encoder2 architecture in which global information is incorporated into the process of local feature extraction from scratch. In particular, Encoder1 serves as a global information provider, while Encoder2 acts as a local feature extractor whose output is fed directly into the classifier. Two modes are also designed for the interaction between the encoders. Thanks to its awareness of global information, our method learns better instance-specific local features and thus avoids complicated upper-level operations. Experiments conducted on eight benchmark datasets demonstrate that the proposed architecture improves local feature driven models by a substantial margin and outperforms the previous best models in the fully-supervised setting.
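The abstract describes the architecture only at a high level. The following minimal PyTorch sketch shows one plausible reading, assuming a BiLSTM as Encoder1, a one-dimensional CNN as Encoder2, and a gating interaction between them; the module names, dimensions, and the specific interaction mode are illustrative assumptions, not the paper's exact design.

# A minimal, illustrative sketch of the Encoder1-Encoder2 idea,
# NOT the authors' exact model. All names, sizes, and the gating-style
# interaction are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder1Encoder2(nn.Module):
    """Encoder1 summarizes the whole sequence; Encoder2 extracts local
    n-gram features conditioned on that global summary."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=128,
                 n_filters=100, kernel_size=3, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Encoder1: global information provider (here a BiLSTM).
        self.encoder1 = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                                bidirectional=True)
        # Encoder2: local feature extractor (here a 1-D CNN).
        self.encoder2 = nn.Conv1d(emb_dim, n_filters, kernel_size,
                                  padding=kernel_size // 2)
        # One possible interaction mode: a global gate over local features.
        self.gate = nn.Linear(2 * hid_dim, n_filters)
        self.classifier = nn.Linear(n_filters, n_classes)

    def forward(self, tokens):                            # tokens: (B, T)
        x = self.embed(tokens)                            # (B, T, E)
        _, (h, _) = self.encoder1(x)                      # h: (2, B, H)
        g = torch.cat([h[0], h[1]], dim=-1)               # global vector (B, 2H)
        local = F.relu(self.encoder2(x.transpose(1, 2)))  # (B, F, T)
        # Inject global information into local feature extraction via gating.
        gate = torch.sigmoid(self.gate(g)).unsqueeze(-1)  # (B, F, 1)
        local = local * gate
        pooled = local.max(dim=-1).values                 # max-over-time pooling
        return self.classifier(pooled)

Conditioning a sigmoid gate on the global vector is only one of many ways the two encoders could interact; the paper itself defines two interaction modes that the abstract does not detail.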
2018
From Random to Supervised: A Novel Dropout Mechanism Integrated with Global Information
Hengru Xu | Shen Li | Renfen Hu | Si Li | Sheng Gao
Proceedings of the 22nd Conference on Computational Natural Language Learning
Dropout is used to avoid overfitting by randomly dropping units from a neural network during training. Inspired by dropout, this paper presents GI-Dropout, a novel dropout method that integrates global information to improve neural networks for text classification. Unlike traditional dropout, in which units are dropped randomly with the same probability, we use explicit instructions based on global information about the dataset to guide the training process. With GI-Dropout, the model is encouraged to pay more attention to inapparent features or patterns. Experiments demonstrate the effectiveness of dropout with global information on seven text classification tasks, including sentiment analysis and topic classification.
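As a rough illustration of the idea, the sketch below replaces the single shared dropout rate with per-unit probabilities scaled by a dataset-level importance score, so that salient ("apparent") features are dropped more often and inapparent ones survive more often; the scoring scheme and the mapping from scores to probabilities are assumptions for illustration, not the paper's exact formulation.

# A minimal sketch of the GI-Dropout idea, NOT the authors' exact method.
import torch
import torch.nn as nn

class GIDropout(nn.Module):
    """Dropout with per-unit probabilities derived from global
    (dataset-level) statistics instead of one shared random rate."""
    def __init__(self, base_p=0.5):
        super().__init__()
        self.base_p = base_p

    def forward(self, x, importance):
        # importance: per-unit salience in [0, 1], computed once over the
        # whole dataset (e.g., from word-class co-occurrence statistics)
        # and broadcastable against x.
        if not self.training:
            return x
        # Salient units get higher drop probability, pushing the model
        # toward inapparent features.
        p = (self.base_p * importance).clamp(0.0, 1.0 - 1e-6)
        mask = torch.bernoulli(1.0 - p)
        return x * mask / (1.0 - p)  # inverted-dropout rescaling

# Usage (hypothetical): y = GIDropout()(embedded_words, word_importance)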