DualReward: A Dynamic Reinforcement Learning Framework for Cloze Tests Distractor Generation
Tianyou Huang | Xinglu Chen | Jingshen Zhang | Xin Ying Qiu | Ruiying Niu
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
"This paper introduces DualReward, a novel reinforcement learning framework for automatic dis-tractor generation in cloze tests. Unlike conventional approaches that rely primarily on super-vised learning or static generative models, our method employs a dual reward structure with adaptive scaling that differentiates between human-created gold standard distractors and model-generated candidates. The framework dynamically adjusts reward signal intensity based on model performance and confidence. We evaluate our approach on both passage-level (CLOTH-F) and sentence-level (MCQ) cloze test datasets, demonstrating consistent improvements overstate-of-the-art baselines. Experimental results show that our adaptive reward scaling mechanism provides modest but consistent benefits on homogeneous datasets (CLOTH-F) and more substantial improvements (3.48-3.86% in P@1) on diverse, cross-domain data (MCQ), suggest-ing its particular effectiveness for handling varied question types and domains. Our work offers a flexible framework that effectively balances learning from reliable human examples while exploring novel, high-quality distractors for automated test generation."