Active Learning by Acquiring Contrastive Examples

Katerina Margatina, Giorgos Vernikos, Loïc Barrault, Nikolaos Aletras


Abstract
Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively. In this work, leveraging the best of both worlds, we propose an acquisition function that selects contrastive examples, i.e., data points that are similar in the model feature space and yet for which the model outputs maximally different predictive likelihoods. We compare our approach, CAL (Contrastive Active Learning), with a diverse set of acquisition functions in four natural language understanding tasks and seven datasets. Our experiments show that CAL performs consistently better than or equal to the best-performing baseline across all tasks, on both in-domain and out-of-domain data. We also conduct an extensive ablation study of our method, and we further analyze all actively acquired datasets, showing that CAL achieves a better trade-off between uncertainty and diversity than other strategies.
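
The definition of a contrastive example in the abstract can be illustrated with a minimal sketch. It is one possible reading, not the authors' implementation: the abstract does not specify how similarity or divergence is measured, so the Euclidean nearest-neighbor search, the KL divergence, the neighborhood size k, and all function and parameter names below are assumptions made for illustration only.

```python
import numpy as np


def contrastive_scores(pool_feats, pool_probs, lab_feats, lab_probs, k=2, eps=1e-12):
    """Score each unlabeled point by how much its predicted distribution
    diverges from those of its k nearest labeled neighbors.

    Assumptions (not stated in the abstract): Euclidean distance in
    feature space, KL(neighbor || candidate) as the divergence, and a
    mean over the k neighbors.
    """
    # Pairwise distances between pool points and labeled points: (n_pool, n_lab)
    dists = np.linalg.norm(pool_feats[:, None, :] - lab_feats[None, :, :], axis=-1)
    # Indices of the k closest labeled points for every pool point: (n_pool, k)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    # Clip probabilities to avoid log(0)
    p = np.clip(lab_probs[neighbors], eps, 1.0)      # (n_pool, k, n_classes)
    q = np.clip(pool_probs[:, None, :], eps, 1.0)    # (n_pool, 1, n_classes)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)  # (n_pool, k)
    return kl.mean(axis=1)


def select_contrastive(pool_feats, pool_probs, lab_feats, lab_probs, budget, k=2):
    """Return indices of the `budget` pool points with the highest scores."""
    scores = contrastive_scores(pool_feats, pool_probs, lab_feats, lab_probs, k=k)
    return np.argsort(-scores)[:budget]


# Toy data: two labeled points, three unlabeled candidates, two classes.
lab_feats = np.array([[0.0, 0.0], [1.0, 0.0]])
lab_probs = np.array([[0.9, 0.1], [0.8, 0.2]])
pool_feats = np.array([[0.1, 0.0], [0.2, 0.1], [5.0, 5.0]])
pool_probs = np.array([[0.1, 0.9], [0.9, 0.1], [0.5, 0.5]])

picked = select_contrastive(pool_feats, pool_probs, lab_feats, lab_probs, budget=2)
```

In this toy setup the first candidate lies close to the labeled points but receives a very different prediction, so it scores highest. The second candidate is close and agrees with its neighbors, so it scores lowest.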
Anthology ID:
2021.emnlp-main.51
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
650–663
URL:
https://aclanthology.org/2021.emnlp-main.51
DOI:
10.18653/v1/2021.emnlp-main.51
PDF:
https://aclanthology.org/2021.emnlp-main.51.pdf
Code:
mourga/contrastive-active-learning
Data:
AG News, GLUE, IMDb Movie Reviews, QNLI, SST