Detecting Diabetes Risk from Social Media Activity

Dane Bell, Egoitz Laparra, Aditya Kousik, Terron Ishihara, Mihai Surdeanu, Stephen Kobourov


Abstract
This work explores the detection of individuals’ risk of type 2 diabetes mellitus (T2DM) directly from their social media (Twitter) activity. Our approach extends a deep learning architecture with several contributions: following previous observations that language use differs by gender, it captures and uses gender information through domain adaptation; it captures recency of posts under the hypothesis that more recent posts are more representative of an individual’s current risk status; and, lastly, it demonstrates that in this scenario, where activity factors are sparsely represented in the data, a bag-of-words neural network model using custom dictionaries of food and activity words performs better than other neural sequence models. Our best model, which incorporates all these contributions, achieves a risk-detection F1 of 41.9, considerably higher than the baseline rate (36.9).
Anthology ID:
W18-5601
Volume:
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis
Month:
October
Year:
2018
Address:
Brussels, Belgium
Venues:
EMNLP | Louhi | WS
Publisher:
Association for Computational Linguistics
Pages:
1–11
URL:
https://aclanthology.org/W18-5601
DOI:
10.18653/v1/W18-5601
Cite (ACL):
Dane Bell, Egoitz Laparra, Aditya Kousik, Terron Ishihara, Mihai Surdeanu, and Stephen Kobourov. 2018. Detecting Diabetes Risk from Social Media Activity. In Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, pages 1–11, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Detecting Diabetes Risk from Social Media Activity (Bell et al., 2018)
PDF:
https://aclanthology.org/W18-5601.pdf