Drug-Use Identification from Tweets with Word and Character N-Grams

Çağrı Çöltekin, Taraka Rama


Abstract
This paper describes our systems for the Social Media Mining for Health Applications (SMM4H) shared task. We participated in all four tracks of the shared task using linear models with a combination of character and word n-gram features. We did not use any external data or domain-specific information. The resulting systems achieved above-average scores among the participating systems, with F1-scores of 91.22, 46.8, 42.4, and 85.53 on tasks 1, 2, 3, and 4, respectively.
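As an illustration of what combining character and word n-gram features for a linear model can look like, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the n-gram ranges, the lowercasing, the whitespace tokenization, and the hand-set weights are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Yield character n-grams of length n_min..n_max, prefixed with 'c:'."""
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            yield "c:" + text[i:i + n]


def word_ngrams(text, n_min=1, n_max=2):
    """Yield word n-grams of length n_min..n_max, prefixed with 'w:'.

    Whitespace tokenization is an assumption of this sketch.
    """
    tokens = text.split()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield "w:" + " ".join(tokens[i:i + n])


def extract_features(text, char_range=(1, 3), word_range=(1, 2)):
    """Combine character and word n-gram counts into one sparse feature map."""
    text = text.lower()  # assumed normalization, not taken from the paper
    feats = Counter(char_ngrams(text, *char_range))
    feats.update(word_ngrams(text, *word_range))
    return feats


def linear_score(feats, weights, bias=0.0):
    """Score a feature map with a linear model: bias + sum of weight * count."""
    return bias + sum(weights.get(name, 0.0) * count
                      for name, count in feats.items())


if __name__ == "__main__":
    feats = extract_features("took my meds today")
    # Hypothetical hand-set weights; a real system would learn them from data.
    weights = {"w:meds": 1.5, "c:med": 0.5}
    print(linear_score(feats, weights))
```

The prefixes `c:` and `w:` keep the two feature families apart in a single sparse map, so a linear classifier can weight a character sequence and an identical word independently.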
Anthology ID:
W18-5914
Volume:
Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Graciela Gonzalez-Hernandez, Davy Weissenbacher, Abeed Sarker, Michael Paul
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
52–53
URL:
https://aclanthology.org/W18-5914
DOI:
10.18653/v1/W18-5914
Cite (ACL):
Çağrı Çöltekin and Taraka Rama. 2018. Drug-Use Identification from Tweets with Word and Character N-Grams. In Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task, pages 52–53, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Drug-Use Identification from Tweets with Word and Character N-Grams (Çöltekin & Rama, EMNLP 2018)
PDF:
https://aclanthology.org/W18-5914.pdf