A-SEA³𝐋-QA: A Fully Automated Self-Evolving, Adversarial Workflow for Arabic Long-Context Question-Answer Generation

Kesen Wang; Daulet Toibazar; Pedro J Moreno Mengibar

doi:10.18653/v1/2025.arabicnlp-main.9

Abstract

We present an end-to-end, self-evolving adversarial workflow for long-context question-answer (QA) generation in Arabic. By orchestrating multiple specialized Large Vision Language Models (LVLMs), namely a question generator, an evaluator, and a swarm of answer generators, our system iteratively refines its own performance without any human intervention. Starting from raw, multi-page Arabic documents across diverse domains, the question generator produces fine-grained, context-aware queries for the answer generator swarm to tackle, and the evaluator assesses the results and feeds back quality metrics. This closed-loop cycle enables continuous learning: low-confidence outputs trigger automated re-generation and model updates, progressively enhancing question difficulty and relevance. Moreover, we treat the quality metrics as a tunable hyperparameter, enabling question generation at controllable and customizable difficulty levels. We release AraLongBench, a large-scale Arabic benchmark of single- and multi-page challenges spanning hundreds of pages, and demonstrate that our self-evolving workflow substantially outperforms static pipelines, markedly boosting the long-context comprehension capabilities of leading Arabic LVLMs. Lastly, we design a fully automated agentic workflow for long-context Arabic document collection.
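The closed loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function and type names (`evolve_question`, `QAResult`, the three callable interfaces), the difficulty metric (one minus the mean evaluator score across the swarm), and the textual feedback string are all assumptions made for illustration, and the model-update step mentioned in the abstract is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical interfaces; the abstract does not specify them.
QuestionGen = Callable[[str, Optional[str]], str]  # (document, feedback) -> question
AnswerGen = Callable[[str, str], str]              # (document, question) -> answer
Evaluator = Callable[[str, str, str], float]       # (document, question, answer) -> score in [0, 1]


@dataclass
class QAResult:
    question: str
    answers: List[str]
    difficulty: float
    rounds: int


def evolve_question(
    document: str,
    generate: QuestionGen,
    swarm: List[AnswerGen],
    evaluate: Evaluator,
    target_difficulty: float = 0.5,  # assumed stand-in for the tunable quality threshold
    max_rounds: int = 5,
) -> QAResult:
    """Regenerate a question until the answer swarm finds it hard enough."""
    if not swarm:
        raise ValueError("swarm must contain at least one answer generator")

    feedback: Optional[str] = None
    best: Optional[QAResult] = None
    for round_no in range(1, max_rounds + 1):
        question = generate(document, feedback)
        answers = [answer(document, question) for answer in swarm]
        scores = [evaluate(document, question, a) for a in answers]

        # Assumed metric: the lower the swarm scores, the harder the question.
        difficulty = 1.0 - sum(scores) / len(scores)
        candidate = QAResult(question, answers, difficulty, round_no)
        if best is None or difficulty > best.difficulty:
            best = candidate
        if difficulty >= target_difficulty:
            return candidate

        # Feed the evaluator's signal back to the question generator.
        feedback = (
            f"difficulty {difficulty:.2f} is below target "
            f"{target_difficulty:.2f}; make the question harder"
        )
    # No candidate reached the target: return the hardest one seen.
    return best
```

Under these assumptions, raising `target_difficulty` plays the role of the tunable hyperparameter: the loop keeps regenerating until the swarm's average score drops far enough, or returns the hardest question found after `max_rounds` attempts.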

Anthology ID: 2025.arabicnlp-main.9
Volume: Proceedings of The Third Arabic Natural Language Processing Conference
Month: November
Year: 2025
Address: Suzhou, China
Editors: Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
Venue: ArabicNLP
SIG: SIGARAB
Publisher: Association for Computational Linguistics
Pages: 107–116
URL: https://aclanthology.org/2025.arabicnlp-main.9/
DOI: 10.18653/v1/2025.arabicnlp-main.9
Cite (ACL): Kesen Wang, Daulet Toibazar, and Pedro J Moreno Mengibar. 2025. A-SEA³𝐋-QA: A Fully Automated Self-Evolving, Adversarial Workflow for Arabic Long-Context Question-Answer Generation. In Proceedings of The Third Arabic Natural Language Processing Conference, pages 107–116, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): A-SEA³𝐋-QA: A Fully Automated Self-Evolving, Adversarial Workflow for Arabic Long-Context Question-Answer Generation (Wang et al., ArabicNLP 2025)
PDF: https://aclanthology.org/2025.arabicnlp-main.9.pdf
