Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods

Xiang Dai, Sarvnaz Karimi


Abstract
Information extraction from scientific literature can be challenging due to the highly specialised nature of such text. We describe the entity recognition methods we developed for the DEAL (Detecting Entities in the Astrophysics Literature) shared task. The aim of the task is to build a system that can identify named entities in a dataset composed of scholarly articles from the astrophysics literature. We planned our participation so that it enabled an empirical comparison between word-based tagging and span-based classification methods. When evaluated on two hidden test sets provided by the organizer, our best-performing submission achieved F1 scores of 0.8307 (validation phase) and 0.7990 (testing phase).
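To make the contrast in the abstract concrete, the sketch below shows the two output representations in general terms: a word-based tagger assigns one BIO tag per word and the tags are then decoded into entities, while a span-based classifier scores candidate (start, end) spans directly. This is an illustration only, not the authors' implementation. The example sentence, the label names, the helper names `bio_to_spans` and `enumerate_spans`, and the `max_width` cut-off are all assumptions made for this example.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode per-word BIO tags (word-based view) into labelled spans."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes an entity that ends at the last word.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((start, i, label))
        start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            # Assumption: a stray I- tag is treated as starting a new entity.
            start, label = i, tag[2:]
    return spans


def enumerate_spans(n_words: int, max_width: int) -> List[Tuple[int, int]]:
    """List candidate spans (span-based view) up to max_width words long."""
    return [
        (s, e)
        for s in range(n_words)
        for e in range(s + 1, min(s + max_width, n_words) + 1)
    ]


# Hypothetical sentence and labels, chosen only for illustration.
words = ["Hubble", "Space", "Telescope", "observed", "M87"]
tags = ["B-Telescope", "I-Telescope", "I-Telescope", "O", "B-CelestialObject"]

entities = bio_to_spans(tags)
candidates = enumerate_spans(len(words), max_width=3)
```

In the word-based view the model makes one decision per word, whereas in the span-based view each candidate in `candidates` would receive an entity label or a "not an entity" label, which is why a width limit is commonly used to keep the number of candidates manageable.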
Anthology ID:
2022.wiesp-1.9
Volume:
Proceedings of the first Workshop on Information Extraction from Scientific Publications
Month:
November
Year:
2022
Address:
Online
Editors:
Tirthankar Ghosal, Sergi Blanco-Cuaresma, Alberto Accomazzi, Robert M. Patton, Felix Grezes, Thomas Allen
Venue:
WIESP
Publisher:
Association for Computational Linguistics
Pages:
78–83
URL:
https://aclanthology.org/2022.wiesp-1.9
Cite (ACL):
Xiang Dai and Sarvnaz Karimi. 2022. Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods. In Proceedings of the first Workshop on Information Extraction from Scientific Publications, pages 78–83, Online. Association for Computational Linguistics.
Cite (Informal):
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods (Dai & Karimi, WIESP 2022)
PDF:
https://aclanthology.org/2022.wiesp-1.9.pdf