Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training

Zhisong Zhang, Emma Strubell, Eduard Hovy


Abstract
In this work, we propose a pragmatic method that reduces annotation cost for structured label spaces using active learning. Our approach leverages partial annotation, which lowers labeling cost for structured outputs by selecting only the most informative sub-structures for annotation. We also utilize self-training to incorporate the current model’s automatic predictions as pseudo-labels for un-annotated sub-structures. A key challenge in effectively combining partial annotation with self-training is determining which sub-structures to select for labeling. To address this challenge, we adopt an error estimator that adaptively sets the partial selection ratio according to the current model’s capability. In evaluations spanning four structured prediction tasks, we show that our combination of partial annotation and self-training with an adaptive selection ratio reduces annotation cost over strong full-annotation baselines under a fair comparison scheme that takes reading time into consideration.
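The abstract's core loop can be sketched for a single sequence: rank sub-structures (here, tokens) by the model's uncertainty, query human labels only for the top fraction, and pseudo-label the rest with the model's own predictions. This is a minimal illustration under assumptions not spelled out in the abstract — entropy as the uncertainty measure and a selection ratio set directly from an estimated error rate are hypothetical simplifications of the paper's error-estimator scheme.

```python
import math

def entropy(probs):
    # Shannon entropy (in nats) of one token's marginal label distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def partial_annotate(marginals, gold, est_error_rate):
    """Sketch: partial annotation + self-training for one sequence.

    marginals: per-token label distributions from the current model.
    gold: oracle labels, queried only for the selected tokens.
    est_error_rate: estimated model error rate in [0, 1]; the selection
        ratio adapts to it (hypothetical rule -- the paper fits a
        dedicated error estimator instead).
    Returns (labels, is_gold) where is_gold marks human-annotated tokens.
    """
    n = len(marginals)
    # Adaptive ratio: annotate more sub-structures when the model errs more.
    k = max(1, round(est_error_rate * n))
    # Rank tokens by marginal entropy, most uncertain first.
    ranked = sorted(range(n), key=lambda i: entropy(marginals[i]), reverse=True)
    selected = set(ranked[:k])
    labels, is_gold = [], []
    for i in range(n):
        if i in selected:
            # Partial annotation: spend human effort here.
            labels.append(gold[i])
            is_gold.append(True)
        else:
            # Self-training: pseudo-label with the model's argmax prediction.
            pred = max(range(len(marginals[i])), key=lambda c: marginals[i][c])
            labels.append(pred)
            is_gold.append(False)
    return labels, is_gold
```

With a high estimated error rate the ratio grows and more tokens are sent to the annotator; as the model improves, the same rule shifts effort toward cheap pseudo-labels, which is the cost-saving trade-off the abstract describes.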
Anthology ID:
2023.findings-emnlp.865
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12991–13008
URL:
https://aclanthology.org/2023.findings-emnlp.865
DOI:
10.18653/v1/2023.findings-emnlp.865
Cite (ACL):
Zhisong Zhang, Emma Strubell, and Eduard Hovy. 2023. Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 12991–13008, Singapore. Association for Computational Linguistics.
Cite (Informal):
Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training (Zhang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.865.pdf