DISTANT: Distantly Supervised Entity Span Detection and Classification

Ken Yano, Makoto Miwa, Sophia Ananiadou


Abstract
We propose DISTANT (DIstantly Supervised enTity spAN deTection and classification), a distantly supervised pipeline NER method that performs entity span detection and entity classification in sequence. The entity span detector first extracts candidate entity mention spans using distant supervision. The entity classifier then assigns each span to one of the positive entity types or to none, using a positive and unlabeled (PU) learning framework. Both models are built on the pre-trained SciBERT model and fine-tuned on a silver corpus generated by distant supervision. Experimental results on the BC5CDR and NCBI-Disease datasets show that our method outperforms end-to-end NER baselines without PU learning by a large margin. In particular, it effectively increases recall.
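As an illustration of the distant-supervision step mentioned in the abstract, the following minimal sketch builds silver span labels by matching a lexicon against tokenized text. It is not the authors' implementation: the greedy longest-match strategy, the function names `silver_spans` and `to_bio`, the `max_len` parameter, and the toy lexicon are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# (start token index, end token index exclusive, entity type)
Span = Tuple[int, int, str]


def silver_spans(
    tokens: List[str],
    lexicon: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[Span]:
    """Greedy longest-match lexicon lookup (an assumed matching strategy).

    Returns non-overlapping spans, scanning left to right.
    """
    spans: List[Span] = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in lexicon:
                match = (i, i + n, lexicon[key])
                break
        if match is not None:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans


def to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Type-agnostic BIO tags, e.g. as targets for a span detector."""
    tags = ["O"] * len(tokens)
    for start, end, _ in spans:
        tags[start] = "B"
        for j in range(start + 1, end):
            tags[j] = "I"
    return tags


if __name__ == "__main__":
    # Hypothetical toy lexicon; a real one would come from a knowledge base.
    lexicon = {
        ("cisplatin",): "Chemical",
        ("kidney", "injury"): "Disease",
        ("acute", "kidney", "injury"): "Disease",
    }
    tokens = "Cisplatin induced acute kidney injury in rats".split()
    spans = silver_spans(tokens, lexicon)
    print(spans)
    print(to_bio(tokens, spans))
```

Tokens that the lexicon does not match are unlabeled rather than confirmed negatives, since any lexicon is incomplete. This is the situation that a positive and unlabeled (PU) learning framework is meant to handle in the classification stage.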
Anthology ID:
2023.bionlp-1.14
Volume:
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
171–177
URL:
https://aclanthology.org/2023.bionlp-1.14
DOI:
10.18653/v1/2023.bionlp-1.14
Cite (ACL):
Ken Yano, Makoto Miwa, and Sophia Ananiadou. 2023. DISTANT: Distantly Supervised Entity Span Detection and Classification. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 171–177, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
DISTANT: Distantly Supervised Entity Span Detection and Classification (Yano et al., BioNLP 2023)
PDF:
https://aclanthology.org/2023.bionlp-1.14.pdf