UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?

Jonathan Beaulieu, Dennis Asamoah Owusu


Abstract
In this paper, we present our system for assigning an emoji to a tweet based on its text. Each tweet was originally posted with an emoji, which the task providers removed. Our task was to decide which of 20 emojis originally came with the tweet. Two datasets were provided: one in English and the other in Spanish. We treated the task as a standard classification task with the emojis as our classes and the tweets as our documents. Our best-performing system used a Bag of Words model with a Linear Support Vector Machine as its classifier. We achieved a macro F1 score of 32.73% for the English data and 17.98% for the Spanish data.
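Because the reported results are macro F1 scores, it may help to see how that metric treats the emoji classes. The sketch below is illustrative only: `macro_f1` is a hypothetical helper, not the task's official scorer, and it assumes the average is taken over the classes that appear in the gold labels. The label names are made up for the example.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 (assumption: classes come from the gold labels)."""
    labels = sorted(set(gold))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the answer but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only; the real task has 20 emoji classes.
gold = ["heart", "heart", "joy", "joy", "fire", "fire"]
pred = ["heart", "heart", "joy", "fire", "fire", "fire"]
score = macro_f1(gold, pred)
```

Each class contributes equally to the average regardless of how many tweets carry it, so weak performance on rare emojis lowers the score as much as weak performance on common ones.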
Anthology ID:
S18-1061
Volume:
Proceedings of the 12th International Workshop on Semantic Evaluation
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, Marine Carpuat
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
400–404
URL:
https://aclanthology.org/S18-1061
DOI:
10.18653/v1/S18-1061
Cite (ACL):
Jonathan Beaulieu and Dennis Asamoah Owusu. 2018. UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 400–404, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices? (Beaulieu & Asamoah Owusu, SemEval 2018)
PDF:
https://aclanthology.org/S18-1061.pdf