NanoFlux: Adversarial Dual-LLM Evaluation and Distillation for Multi-Domain Reasoning

Raviteja Anantha; Soheil Hor; Teodor Nicola Antoniu; Layne C Price

Abstract

We present NanoFlux, a novel adversarial framework for generating targeted training data to improve LLM reasoning, in which adversarially generated datasets of ≤ 200 examples outperform conventional fine-tuning approaches. The framework sets up a competitive dynamic between models that alternate as Attacker and Defender, supervised by a tool-augmented Judge, to synthesize multi-step questions with explanatory annotations. Fine-tuning a 4B-parameter model on NanoFlux-generated data yields performance gains across diverse domains compared to full-benchmark fine-tuning: +5.9% on mathematical reasoning, +3.6% on scientific reasoning, and +16.6% on medical reasoning, while reducing computational requirements by 3–14×. Ablation studies reveal a non-monotonic relationship between dataset characteristics and model performance, uncovering domain-specific optimal points for question complexity and reasoning quality. NanoFlux automates training data generation through embedding-based novelty filtering, tool-augmented evaluation, and multi-hop reasoning, pointing to the value of small, targeted training datasets.

Anthology ID: 2026.gem-main.27
Volume: Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Simon Mille, Sebastian Gehrmann, Patrícia Schmidtová, Ondřej Dušek, Marzieh Fadaee, Kyle Lo, Enrico Santus, Gabriel Stanovsky
Venues: GEM | WS
Publisher: Association for Computational Linguistics
Pages: 253–270
URL: https://aclanthology.org/2026.gem-main.27/
Cite (ACL): Raviteja Anantha, Soheil Hor, Teodor Nicola Antoniu, and Layne C Price. 2026. NanoFlux: Adversarial Dual-LLM Evaluation and Distillation for Multi-Domain Reasoning. In Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM), pages 253–270, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): NanoFlux: Adversarial Dual-LLM Evaluation and Distillation for Multi-Domain Reasoning (Anantha et al., GEM 2026)
PDF: https://aclanthology.org/2026.gem-main.27.pdf
