Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset and Transformer-Based Results

Ian Magnusson, Scott Friedman


Abstract
Recent transformer-based approaches demonstrate promising results on relational scientific information extraction. Existing datasets focus on high-level description of how research is carried out. Instead we focus on the subtleties of how experimental associations are presented by building SciClaim, a dataset of scientific claims drawn from Social and Behavior Science (SBS), PubMed, and CORD-19 papers. Our novel graph annotation schema incorporates not only coarse-grained entity spans as nodes and relations as edges between them, but also fine-grained attributes that modify entities and their relations, for a total of 12,738 labels in the corpus. By including more label types and more than twice the label density of previous datasets, SciClaim captures causal, comparative, predictive, statistical, and proportional associations over experimental variables along with their qualifications, subtypes, and evidence. We extend work in transformer-based joint entity and relation extraction to effectively infer our schema, showing the promise of fine-grained knowledge graphs in scientific claims and beyond.
Anthology ID:
2021.emnlp-main.381
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4651–4658
URL:
https://aclanthology.org/2021.emnlp-main.381
DOI:
10.18653/v1/2021.emnlp-main.381
Cite (ACL):
Ian Magnusson and Scott Friedman. 2021. Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset and Transformer-Based Results. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4651–4658, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset and Transformer-Based Results (Magnusson & Friedman, EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.381.pdf
Video:
 https://aclanthology.org/2021.emnlp-main.381.mp4
Code
 siftech/sciclaim
Data
CORD-19, SciERC, SciREX, SemEval-2017 Task-10