Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines

Mahmoud Azab, Stephane Dadian, Vivi Nastase, Larry An, Rada Mihalcea


Abstract
We introduce a new dataset consisting of natural language interactions annotated with medical family histories, obtained during interactions with a genetic counselor and through crowdsourcing, following a questionnaire created by experts in the domain. We describe the data collection process and the annotations performed by medical professionals, including illness and personal attributes (name, age, gender, family relationships) for the patient and their family members. An initial system that performs argument identification and relation extraction shows promising results, with an average F-score of 0.87 on the targeted relations in complex sentences.
Anthology ID:
D19-1122
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1255–1260
URL:
https://aclanthology.org/D19-1122/
DOI:
10.18653/v1/D19-1122
Cite (ACL):
Mahmoud Azab, Stephane Dadian, Vivi Nastase, Larry An, and Rada Mihalcea. 2019. Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1255–1260, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines (Azab et al., EMNLP-IJCNLP 2019)
PDF:
https://aclanthology.org/D19-1122.pdf