Samridhi Dev
2023
Annotated and Normalized Causal Relation Extraction Corpus for Improving Health Informatics
Samridhi Dev | Aditi Sharan
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
In the ever-expanding landscape of biomedical research, the development of new cancer drugs has increased the likelihood of adverse drug reactions (ADRs). However, information about these ADRs is often buried in unstructured data, which must be converted into a structured, labeled dataset before potential ADRs and the associations between them can be identified. This makes entity extraction and causal relation analysis pivotal tasks. Machine learning methods have been used to identify ADRs, but the current literature has several gaps: limited coverage, superficial manual annotation, and the lack of a labeled ADR corpus that is specific to cancer and has normalized entities. Current datasets are generated manually from abstracts, which limits their scope. To address these limitations, the paper presents an algorithm that automatically constructs and annotates a corpus, normalizes entities specific to cancer, and identifies causal relationships among entities using linguistic and grammatical properties together with the MetaMap and UMLS tools, enabling efficient information retrieval. In addition, a knowledge graph was created for a case report to visualize the causal relationships.
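As a rough illustration of the kind of pipeline the abstract describes, the sketch below extracts cause-effect pairs from sentences using a few causal cue phrases and collects them into a small graph. It is not the authors' algorithm: the cue list, the function names `extract_causal` and `build_graph`, and the example sentences are all hypothetical, and lowercasing stands in for the MetaMap/UMLS normalization step, which is not reproduced here.

```python
import re

# Hypothetical cue phrases (assumption: the abstract does not list the
# paper's actual linguistic rules). "forward" means "cause CUE effect",
# "backward" means "effect CUE cause".
CUES = {
    "caused by": "backward",
    "induced by": "backward",
    "due to": "backward",
    "resulted in": "forward",
    "led to": "forward",
    "caused": "forward",
    "induced": "forward",
}

# Longest cues come first so that "caused by" wins over "caused".
_CUE_ALTERNATION = "|".join(
    re.escape(cue) for cue in sorted(CUES, key=len, reverse=True)
)
_PATTERN = re.compile(
    r"^(?P<left>.+?)\s+(?P<cue>" + _CUE_ALTERNATION + r")\s+(?P<right>.+?)\.?$",
    re.IGNORECASE,
)


def normalize(mention):
    """Placeholder normalization: trim and lowercase.

    A real system would map the mention to a controlled vocabulary
    (the paper uses MetaMap and UMLS for this); that step is omitted.
    """
    return mention.strip().lower()


def extract_causal(sentence):
    """Return (cause, effect) for the first cue phrase found, else None."""
    match = _PATTERN.match(sentence.strip())
    if match is None:
        return None
    left = normalize(match.group("left"))
    right = normalize(match.group("right"))
    if CUES[match.group("cue").lower()] == "forward":
        return (left, right)
    return (right, left)


def build_graph(sentences):
    """Build a cause -> set-of-effects mapping from a list of sentences."""
    graph = {}
    for sentence in sentences:
        pair = extract_causal(sentence)
        if pair is not None:
            cause, effect = pair
            graph.setdefault(cause, set()).add(effect)
    return graph


if __name__ == "__main__":
    # Made-up example sentences, not taken from the corpus.
    report = [
        "Cisplatin induced severe nausea.",
        "Rash caused by cisplatin",
        "The patient was discharged.",
    ]
    for cause, effects in build_graph(report).items():
        print(cause, "->", sorted(effects))
```

The mapping returned by `build_graph` is a minimal adjacency-list form of a knowledge graph: each key is a cause node and each value holds the effect nodes it points to. A cue-phrase matcher like this handles only simple single-clause sentences, so it should be read as a toy baseline rather than a substitute for the grammatical analysis the paper relies on.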