Advancing Active Learning with Ensemble Strategies

Naif Alatrush; Sultan Alsarra; Afraa Alshammari; Luay Abdeljaber; Niamat Zawad; Latifur Khan; Patrick T. Brandt; Javier Osorio; Vito D’Orazio

Abstract

Active learning (AL) reduces annotation costs by selecting the most informative samples for labeling. However, traditional AL methods rely on a single heuristic, limiting data exploration and annotation efficiency. This paper introduces two ensemble-based AL methods: Ensemble Union, which combines multiple heuristics to improve dataset exploration, and Ensemble Intersection, which applies majority voting for robust sample selection. We evaluate these approaches on the United Nations Parallel Corpus (UNPC) in both English and Spanish using domain-specific models such as ConfliBERT. Our results show that ensemble-based AL strategies outperform individual heuristics, achieving classification performance comparable to full dataset training while using significantly fewer labeled examples. Although focused on political texts, the proposed methods are applicable to broader NLP annotation tasks where labeling costs are high.
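
The two selection schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not specify which heuristics are combined or how many samples each one proposes, so everything concrete here is an assumption made for illustration: the three uncertainty heuristics (entropy, least confidence, margin), the per-heuristic budget `k`, the reading of "union" as the set union of each heuristic's top-`k` picks, and the strict-majority threshold. The function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Set

# A heuristic maps per-sample class probabilities to one score per sample,
# where a higher score means "more informative to label" (assumed convention).
Probs = Sequence[Sequence[float]]
Heuristic = Callable[[Probs], List[float]]


def entropy_scores(probs: Probs) -> List[float]:
    # Shannon entropy of each predicted distribution.
    return [-sum(p * math.log(p) for p in row if p > 0) for row in probs]


def least_confidence_scores(probs: Probs) -> List[float]:
    # One minus the probability of the most likely class.
    return [1.0 - max(row) for row in probs]


def margin_scores(probs: Probs) -> List[float]:
    # Negated gap between the two most likely classes (small gap = high score).
    scores = []
    for row in probs:
        top = sorted(row, reverse=True)
        scores.append(-(top[0] - top[1]))
    return scores


def top_k(scores: List[float], k: int) -> Set[int]:
    # Indices of the k highest-scoring samples.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def ensemble_union(probs: Probs, heuristics: List[Heuristic], k: int) -> Set[int]:
    # Assumed reading of "Ensemble Union": pool every heuristic's top-k picks.
    selected: Set[int] = set()
    for heuristic in heuristics:
        selected |= top_k(heuristic(probs), k)
    return selected


def ensemble_intersection(probs: Probs, heuristics: List[Heuristic], k: int) -> Set[int]:
    # Assumed reading of "Ensemble Intersection": keep samples that a strict
    # majority of heuristics place in their top-k.
    votes: Counter = Counter()
    for heuristic in heuristics:
        votes.update(top_k(heuristic(probs), k))
    majority = len(heuristics) // 2 + 1
    return {idx for idx, count in votes.items() if count >= majority}


# Toy unlabeled pool: one row of class probabilities per sample (made-up values).
pool = [
    [0.34, 0.33, 0.33],    # index 0: near-uniform prediction
    [0.98, 0.01, 0.01],    # index 1: confident prediction
    [0.50, 0.495, 0.005],  # index 2: two classes almost tied
    [0.60, 0.20, 0.20],    # index 3: moderately uncertain
]
heuristics = [entropy_scores, least_confidence_scores, margin_scores]

union_pick = ensemble_union(pool, heuristics, k=1)
majority_pick = ensemble_intersection(pool, heuristics, k=1)
```

In this toy pool, entropy and least confidence both rank index 0 first while margin ranks index 2 first, so the union returns both samples and the majority vote keeps only index 0. That contrast is the trade-off the abstract describes: the union widens exploration, and the vote favors samples that several heuristics agree on.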

Anthology ID: 2025.ranlp-1.7
Volume: Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month: September
Year: 2025
Address: Varna, Bulgaria
Editors: Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue: RANLP
Publisher: INCOMA Ltd., Shoumen, Bulgaria
Pages: 57–66
URL: https://aclanthology.org/2025.ranlp-1.7/
Cite (ACL): Naif Alatrush, Sultan Alsarra, Afraa Alshammari, Luay Abdeljaber, Niamat Zawad, Latifur Khan, Patrick T. Brandt, Javier Osorio, and Vito D’Orazio. 2025. Advancing Active Learning with Ensemble Strategies. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 57–66, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal): Advancing Active Learning with Ensemble Strategies (Alatrush et al., RANLP 2025)
PDF: https://aclanthology.org/2025.ranlp-1.7.pdf
