NUS-IDS at CASE 2021 Task 1: Improving Multilingual Event Sentence Coreference Identification With Linguistic Information

Fiona Anting Tan, Sujatha Das Gollapalli, See-Kiong Ng


Abstract
Event Sentence Coreference Identification (ESCI) aims to cluster event sentences that refer to the same event for information extraction. We describe our ESCI solution developed for the ACL-CASE 2021 shared tasks on the detection and classification of socio-political and crisis event information in a multilingual setting. For a given article, our proposed pipeline comprises an accurate sentence pair classifier that identifies coreferent sentence pairs, followed by a clustering step that uses the predicted pair probabilities to group sentences. Sentence pair representations are constructed from fine-tuned BERT embeddings plus POS embeddings fed through a BiLSTM model, and are combined with linguistically derived lexical and semantic similarities between sentences. Our best models ranked 2nd, 1st and 2nd, and obtained CoNLL F1 scores of 81.20%, 93.03% and 83.15% on the English, Portuguese and Spanish test sets respectively, in the ACL-CASE 2021 competition.
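As a rough illustration of the second pipeline stage (turning pairwise coreference probabilities into sentence clusters), the following is a minimal sketch. It is not the authors' clustering procedure, which the abstract does not specify: it assumes a simple rule where any pair scoring at or above a threshold is linked, and clusters are the resulting connected components. The function name `cluster_from_pair_probs`, the `threshold` parameter, and the example probabilities are all hypothetical.

```python
def cluster_from_pair_probs(n_sentences, pair_probs, threshold=0.5):
    """Group sentence indices into clusters from pairwise probabilities.

    Assumption (not from the paper): two sentences are linked when their
    predicted coreference probability is >= threshold, and clusters are
    the connected components of the resulting graph (union-find).
    """
    parent = list(range(n_sentences))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as the root for a stable result.
            parent[max(ri, rj)] = min(ri, rj)

    for (i, j), prob in pair_probs.items():
        if prob >= threshold:
            union(i, j)

    clusters = {}
    for s in range(n_sentences):
        clusters.setdefault(find(s), []).append(s)
    return sorted(clusters.values())


# Hypothetical classifier outputs for four event sentences in one article.
example_probs = {
    (0, 1): 0.90, (0, 2): 0.10, (0, 3): 0.05,
    (1, 2): 0.20, (1, 3): 0.10, (2, 3): 0.80,
}
example_clusters = cluster_from_pair_probs(4, example_probs)
print(example_clusters)
```

With these made-up probabilities, sentences 0 and 1 end up in one cluster and sentences 2 and 3 in another. Linking by connected components is transitive, so a single high-scoring pair can merge two otherwise distinct groups; a stricter scheme would be a reasonable alternative.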
Anthology ID:
2021.case-1.14
Volume:
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | CASE | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
105–112
URL:
https://aclanthology.org/2021.case-1.14
DOI:
10.18653/v1/2021.case-1.14
PDF:
https://aclanthology.org/2021.case-1.14.pdf
Code:
nus-ids/eventsentencecoref