Towards Computationally Feasible Deep Active Learning

Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov


Abstract
Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many others. One such problem is the excessive computational resources required to train an acquisition model and estimate its uncertainty on instances in the unlabeled pool. We propose two techniques that tackle this issue for text classification and tagging tasks, offering a substantial reduction of AL iteration duration and the computational overhead introduced by deep acquisition models in AL. We also demonstrate that our algorithm, which leverages pseudo-labeling and distilled models, overcomes one of the essential obstacles revealed previously in the literature. Namely, it was shown that the benefits of AL can diminish due to differences between the acquisition model used to select instances during AL and the successor model trained on the labeled data. We show that our algorithm, despite using a smaller and faster acquisition model, is capable of training a more expressive successor model with higher performance.
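To make the cost described in the abstract concrete, the following is a minimal, generic sketch of the query-selection step in pool-based AL: an acquisition model scores every unlabeled instance and the most uncertain ones are sent for annotation. It is an illustration only and not the authors' algorithm. The least-confidence criterion, the function names `least_confidence` and `select_queries`, and the toy probabilities are all assumptions introduced here.

```python
import numpy as np


def least_confidence(probs):
    """Uncertainty score per instance: 1 - max predicted class probability."""
    probs = np.asarray(probs, dtype=float)
    return 1.0 - probs.max(axis=1)


def select_queries(probs, batch_size):
    """Return indices of the `batch_size` most uncertain pool instances."""
    scores = least_confidence(probs)
    # Stable sort on negated scores: highest uncertainty first,
    # ties keep their original pool order.
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size].tolist()


# Hypothetical acquisition-model outputs for a 4-instance unlabeled pool
# (assumed values, not results from the paper).
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

# Indices of the two instances that would be sent to annotators.
queries = select_queries(pool_probs, batch_size=2)
```

In a real AL iteration, `pool_probs` would come from a forward pass of the acquisition model over the whole unlabeled pool, which is the step whose cost grows with both model size and pool size.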
Anthology ID:
2022.findings-naacl.90
Volume:
Findings of the Association for Computational Linguistics: NAACL 2022
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1198–1218
URL:
https://aclanthology.org/2022.findings-naacl.90
DOI:
10.18653/v1/2022.findings-naacl.90
Cite (ACL):
Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, and Leonid Zhukov. 2022. Towards Computationally Feasible Deep Active Learning. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1198–1218, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Towards Computationally Feasible Deep Active Learning (Tsvigun et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-naacl.90.pdf
Video:
https://aclanthology.org/2022.findings-naacl.90.mp4
Code:
airi-institute/al_nlp_feasible
Data:
AG News, CoNLL 2003, IMDb Movie Reviews, OntoNotes 5.0