Incorporating Emoji Descriptions Improves Tweet Classification

Abhishek Singh, Eduardo Blanco, Wei Jin


Abstract
Tweets are short messages that often include specialized language such as hashtags and emojis. In this paper, we present a simple strategy to process emojis: replace them with their natural language description and use pretrained word embeddings, as is normally done with standard words. We show that this strategy is more effective than using pretrained emoji embeddings for tweet classification. Specifically, we obtain new state-of-the-art results in irony detection and sentiment analysis even though our neural network is simpler than previous proposals.
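The replacement step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `describe_emojis` is hypothetical, and the sketch assumes that the lowercased Unicode character name from Python's standard `unicodedata` module is an acceptable stand-in for an emoji's natural language description. The abstract does not say where the descriptions come from.

```python
import unicodedata


def describe_emojis(text: str) -> str:
    """Replace emoji-like characters with their lowercased Unicode names.

    Assumption: the Unicode name approximates the emoji description.
    The category "So" (Symbol, other) covers most single-code-point emojis,
    but it is broader than emojis (it also matches symbols such as the
    copyright sign) and does not handle multi-code-point sequences.
    """
    pieces = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            if name:
                # Pad with spaces so the description splits into ordinary
                # word tokens even when the emoji touches adjacent text.
                pieces.append(" " + name.lower() + " ")
                continue
        pieces.append(ch)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join("".join(pieces).split())


print(describe_emojis("I love this \U0001F602"))
# prints "I love this face with tears of joy"
```

After this step the tweet contains only ordinary words, so each token can be looked up in a pretrained word embedding table with no separate emoji vocabulary.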
Anthology ID:
N19-1214
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2096–2101
URL:
https://aclanthology.org/N19-1214/
DOI:
10.18653/v1/N19-1214
Cite (ACL):
Abhishek Singh, Eduardo Blanco, and Wei Jin. 2019. Incorporating Emoji Descriptions Improves Tweet Classification. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2096–2101, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Incorporating Emoji Descriptions Improves Tweet Classification (Singh et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-1214.pdf
Video:
https://aclanthology.org/N19-1214.mp4