ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior Detection

Sunjae Kwon; Xun Wang; Weisong Liu; Emily Druhl; Minhee Sung; Joel Reisman; Wenjun Li; Robert Kerns; William Becker; Hong Yu

doi:10.18653/v1/2024.naacl-long.244

Abstract

Opioid-related aberrant behaviors (ORABs) present novel risk factors for opioid overdose. This paper introduces ODD (ORAB Detection Dataset), a novel biomedical natural language processing benchmark dataset. ODD is an expert-annotated dataset designed to identify ORABs from patients’ electronic health record (EHR) notes and classify them into nine categories: 1) Confirmed Aberrant Behavior, 2) Suggested Aberrant Behavior, 3) Opioids, 4) Indication, 5) Diagnosed Opioid Dependency, 6) Benzodiazepines, 7) Medication Changes, 8) Central Nervous System-related, and 9) Social Determinants of Health. We explored two state-of-the-art natural language processing approaches (fine-tuning and prompt-tuning) to identify ORABs. Experimental results show that the prompt-tuning models outperformed the fine-tuning models in most categories, and the gains were especially large for uncommon categories (Suggested Aberrant Behavior, Confirmed Aberrant Behavior, Diagnosed Opioid Dependency, and Medication Changes). Although the best model achieved a macro-average area under the precision-recall curve of 88.17%, the uncommon classes still leave substantial room for improvement. ODD is publicly available.

Anthology ID: 2024.naacl-long.244
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 4338–4359
URL: https://aclanthology.org/2024.naacl-long.244
DOI: 10.18653/v1/2024.naacl-long.244
Cite (ACL): Sunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee Sung, Joel Reisman, Wenjun Li, Robert Kerns, William Becker, and Hong Yu. 2024. ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior Detection. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4338–4359, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior Detection (Kwon et al., NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-long.244.pdf
