Yuan-Hao Cheng


2023

This study describes the model design of the NCUEE-NLP system for the SemEval-2023 NLI4CT task, which focuses on multi-evidence natural language inference for clinical trial data. We use the biomedical-domain LinkBERT transformer (denoted as BioLinkBERT) as our main system architecture. First, a set of sentences in clinical trial reports is extracted as evidence for premise-statement inference. The identified evidence is then used to determine the inference relation (i.e., entailment or contradiction). Finally, a soft voting ensemble mechanism is applied to enhance system performance. For Subtask 1 on textual entailment, our best submission achieved an F1-score of 0.7091, ranking sixth among all 30 participating teams. For Subtask 2 on evidence retrieval, our best submission achieved an F1-score of 0.7940, ranking ninth among 19 submissions.
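The soft voting step mentioned above can be illustrated with a minimal sketch. It assumes each ensemble member outputs a probability vector over the two labels and that the vectors are averaged with equal weights; the abstract does not state the weighting scheme, and the function names and probabilities here are hypothetical.

```python
from typing import List, Sequence

# Label order is an assumption made for this illustration.
LABELS = ("Entailment", "Contradiction")


def soft_vote(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors of several models (equal weights assumed)."""
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def predict(prob_sets: Sequence[Sequence[float]]) -> str:
    """Return the label with the highest averaged probability."""
    avg = soft_vote(prob_sets)
    best = max(range(len(avg)), key=avg.__getitem__)
    return LABELS[best]


# Illustrative probabilities only, not outputs of the described system.
# Two of three models lean slightly toward "Contradiction", but one model
# is very confident in "Entailment", so the averaged probabilities favor it.
example = [[0.90, 0.10], [0.45, 0.55], [0.45, 0.55]]
print(predict(example))
```

Unlike majority (hard) voting, which would pick the label chosen by most models, soft voting lets a confident model outweigh several uncertain ones.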
This study describes the model design of the NCUEE-NLP system for SemEval-2023 Task 8. We fine-tune pre-trained transformer models on the task datasets to identify medical causal claims and, given a claim, to extract the population, intervention, and outcome (PIO) elements in a Reddit post. Our best submission for the causal claim identification subtask achieved an F1-score of 70.15%. Our best submission for the PIO frame extraction subtask achieved F1-scores of 37.78% for the Population class, 43.58% for the Intervention class, and 30.67% for the Outcome class, resulting in a macro-averaged F1-score of 37.34%. Our system ranked second among all participating teams.
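As a quick check of the arithmetic, the macro-averaged F1-score is the unweighted mean of the three per-class scores reported above. The helper name below is hypothetical and only illustrates the calculation.

```python
def macro_f1(per_class_f1: dict) -> float:
    """Unweighted mean of per-class F1-scores."""
    return sum(per_class_f1.values()) / len(per_class_f1)


# Per-class F1-scores (in percent) as reported in the abstract.
scores = {"Population": 37.78, "Intervention": 43.58, "Outcome": 30.67}
print(round(macro_f1(scores), 2))  # 37.34
```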