Part-of-Speech Tagging for Twitter with Adversarial Neural Networks

Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, Xuanjing Huang


Abstract
In this work, we study the problem of part-of-speech tagging for Tweets. In contrast to newswire articles, Tweets are usually informal and contain numerous out-of-vocabulary words. Moreover, there is a lack of large-scale labeled datasets for this domain. To tackle these challenges, we propose a novel neural network that makes use of out-of-domain labeled data, unlabeled in-domain data, and labeled in-domain data. Inspired by adversarial neural networks, the proposed method learns common features through an adversarial discriminator. In addition, we hypothesize that domain-specific features of the target domain should be preserved to some degree. Hence, the proposed method adopts a sequence-to-sequence autoencoder to perform this task. Experimental results on three different datasets show that our method achieves better performance than state-of-the-art methods.
Anthology ID:
D17-1256
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2411–2420
URL:
https://aclanthology.org/D17-1256/
DOI:
10.18653/v1/D17-1256
Cite (ACL):
Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, and Xuanjing Huang. 2017. Part-of-Speech Tagging for Twitter with Adversarial Neural Networks. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2411–2420, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Part-of-Speech Tagging for Twitter with Adversarial Neural Networks (Gui et al., EMNLP 2017)
PDF:
https://aclanthology.org/D17-1256.pdf
Video:
https://aclanthology.org/D17-1256.mp4
Data
Tweebank