Leveraging Active Learning to Minimise SRL Annotation Across Corpora

Skatje Myers, Martha Palmer


Abstract
In this paper, we investigate the application of active learning to semantic role labeling (SRL) using Bayesian Active Learning by Disagreement (BALD). Our new predicate-focused selection method quickly improves efficiency on three different specialised domain corpora. This is encouraging news for researchers wanting to port SRL to domain-specific applications. Interestingly, with the large and diverse OntoNotes corpus, the sentence selection approach, which collects a larger number of predicates and so takes more time to annotate, fares better than the predicate approach. We analyze in detail both the selections made by our two selection methods for the various domains and the differences between these corpora.
Anthology ID:
2023.starsem-1.34
Volume:
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Alexis Palmer, Jose Camacho-Collados
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
399–408
URL:
https://aclanthology.org/2023.starsem-1.34
DOI:
10.18653/v1/2023.starsem-1.34
Cite (ACL):
Skatje Myers and Martha Palmer. 2023. Leveraging Active Learning to Minimise SRL Annotation Across Corpora. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 399–408, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Leveraging Active Learning to Minimise SRL Annotation Across Corpora (Myers & Palmer, *SEM 2023)
PDF:
https://aclanthology.org/2023.starsem-1.34.pdf