BERTatDE at SemEval-2020 Task 6: Extracting Term-definition Pairs in Free Text Using Pre-trained Model

Huihui Zhang, Feiliang Ren


Abstract
Definition extraction is an important task in natural language processing; it identifies terms and the definitions associated with them. The task comprises a sentence classification task (deciding whether a sentence contains a definition) and a sequence labeling task (finding the boundaries of terms and definitions). This paper describes our system, BERTatDE, for the sentence classification task (subtask 1) and the sequence labeling task (subtask 2) of the definition extraction task (SemEval-2020 Task 6). We use BERT to address multi-domain problems, including the uncertainty of term boundaries, that is, different domains define their domain-specific terms in different ways. For subtask 1, we use BERT, BiLSTM and attention; our best result is an F1 of 79.71%, which ranks eighteenth. For subtask 2, we use BERT, BiLSTM and CRF for sequence labeling and achieve a macro-averaged F1 of 40.73%.
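To make the sequence labeling subtask concrete, the sketch below shows how term and definition boundaries could be read off a tagged sentence. It is only an illustration, not the authors' pipeline: the BIO tag names (`B-Term`, `I-Term`, `B-Definition`, `I-Definition`, `O`), the helper `bio_to_spans`, and the example sentence are assumptions introduced here.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from a BIO-tagged token sequence.

    Hypothetical helper: the tag scheme is assumed, not taken from the paper.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one first.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label extends the open span.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes any open span;
            # the token itself is not added to a span.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    # Flush a span that runs to the end of the sentence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Made-up example sentence with assumed tags.
tokens = ["A", "cell", "is", "the", "basic", "unit", "of", "life", "."]
tags = ["O", "B-Term", "O", "B-Definition", "I-Definition",
        "I-Definition", "I-Definition", "I-Definition", "O"]
print(bio_to_spans(tokens, tags))
```

In a system like the one described, the tags would be predicted by the sequence labeling model; here they are written by hand so the boundary-recovery step can be read in isolation.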
Anthology ID:
2020.semeval-1.90
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Editors:
Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
690–696
URL:
https://aclanthology.org/2020.semeval-1.90
DOI:
10.18653/v1/2020.semeval-1.90
Cite (ACL):
Huihui Zhang and Feiliang Ren. 2020. BERTatDE at SemEval-2020 Task 6: Extracting Term-definition Pairs in Free Text Using Pre-trained Model. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 690–696, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
BERTatDE at SemEval-2020 Task 6: Extracting Term-definition Pairs in Free Text Using Pre-trained Model (Zhang & Ren, SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.90.pdf
Data:
DEFT Corpus