PharmaCoNER: Pharmacological Substances, Compounds and proteins Named Entity Recognition track

Aitor González-Agirre; Montserrat Marimon; Ander Intxaurrondo; Obdulia Rabal; Marta Villegas; Martin Krallinger

doi:10.18653/v1/D19-5701

Abstract

Chemical compounds and drugs are among the biomedical entity types most relevant to medicine and the biosciences. The correct detection of these entities is critical for other text mining applications that build on them, such as adverse drug-reaction detection, medication-related fake news detection, or drug-target extraction. Although significant effort has been made to detect mentions of drugs and chemicals in English texts, so far only very limited attempts have been made to recognize them in medical documents in other languages. Taking into account the growing number of medical publications and clinical records written in Spanish, we have organized the first shared task on detecting drug and chemical entities in Spanish medical documents. Additionally, we included a clinical concept-indexing sub-track asking teams to return SNOMED-CT identifiers related to drugs/chemicals for a collection of documents. For this task, named PharmaCoNER, we generated annotation guidelines together with a corpus of 1,000 manually annotated clinical case studies. A total of 22 teams participated in sub-track 1 (77 system runs), and 7 teams in sub-track 2 (19 system runs). Top-scoring teams used sophisticated deep learning approaches, yielding very competitive results with F-measures above 0.91. These results indicate that there is a real interest in promoting biomedical text mining efforts beyond English. We foresee that the PharmaCoNER annotation guidelines, corpus and participant systems will foster the development of new resources for clinical and biomedical text mining of Spanish medical data.

Anthology ID: D19-5701
Volume: Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Kim Jin-Dong, Nédellec Claire, Bossy Robert, Deléger Louise
Venue: BioNLP
Publisher: Association for Computational Linguistics
Pages: 1–10
URL: https://aclanthology.org/D19-5701/
DOI: 10.18653/v1/D19-5701
Cite (ACL): Aitor Gonzalez-Agirre, Montserrat Marimon, Ander Intxaurrondo, Obdulia Rabal, Marta Villegas, and Martin Krallinger. 2019. PharmaCoNER: Pharmacological Substances, Compounds and proteins Named Entity Recognition track. In Proceedings of the 5th Workshop on BioNLP Open Shared Tasks, pages 1–10, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): PharmaCoNER: Pharmacological Substances, Compounds and proteins Named Entity Recognition track (Gonzalez-Agirre et al., BioNLP 2019)
PDF: https://aclanthology.org/D19-5701.pdf
