Weilu Xu
2025
基于强化学习的大语言模型古文释义选择研究 (Reinforcement Learning for Classical Chinese Interpretation Selection with Large Language Models)
Weilu Xu | Shujian Huang
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
"古文释义选择任务对语言模型的语义理解与语境匹配能力提出了较高挑战。本文提出一种基于强化学习的训练框架,通过结果导向的奖励设计,引导大语言模型优化古文释义判断策略。实验表明,相比监督微调(Supervised Fine-tuning, SFT),强化学习方法在准确率指标上表现更优。进一步分析发现,强化学习仅在释义选择任务上的训练不仅提升了模型的古文翻译能力,还在古汉语通用能力评估基准(ACLUE)上展现出更优的跨任务迁移性。相较之下,SFT训练后的模型在翻译与其他古文任务中的表现出现明显下降。本研究为古文处理任务提供了新的训练范式,验证了强化学习在非推理类语言任务中的有效性与泛化潜力。"
LLM’s Weakness in NER Doesn’t Stop It from Enhancing a Stronger SLM
Weilu Xu | Renfei Dang | Shujian Huang
Proceedings of the Second Workshop on Ancient Language Processing
Large Language Models (LLMs) demonstrate strong semantic understanding and extensive knowledge, but struggle with Named Entity Recognition (NER) due to hallucination and high training costs. Meanwhile, supervised Small Language Models (SLMs) efficiently provide structured predictions but lack adaptability to unseen entities and complex contexts. In this study, we investigate how a relatively weaker LLM can effectively support a supervised model in NER tasks. We first improve the LLM using LoRA-based fine-tuning and similarity-based prompting, achieving performance comparable to an SLM baseline. To further improve results, we propose a fusion strategy that integrates both models: prioritising the SLM's predictions while using LLM guidance in low-confidence cases. Our hybrid approach outperforms both baselines on three classical Chinese NER datasets.
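The abstract describes the fusion only at a high level; the sketch below shows one plausible form of such confidence-gated fusion, in which high-confidence SLM spans are kept as-is and low-confidence spans are retained only when the LLM agrees. The threshold value, span representation, and function names are assumptions for illustration, not the paper's implementation.

# Hypothetical sketch of confidence-gated SLM/LLM fusion for NER.
# The 0.7 threshold and the data shapes are illustrative assumptions.

from typing import List, Tuple

Span = Tuple[int, int, str]  # (start_offset, end_offset, entity_type)

def fuse_predictions(
    slm_spans: List[Tuple[Span, float]],  # SLM spans with confidence scores
    llm_spans: List[Span],                # LLM-suggested spans (unscored)
    threshold: float = 0.7,               # assumed confidence cut-off
) -> List[Span]:
    """Keep SLM spans whose confidence meets the threshold; for the rest,
    keep a span only if the LLM also predicts it."""
    llm_set = set(llm_spans)
    fused: List[Span] = []
    for span, confidence in slm_spans:
        if confidence >= threshold or span in llm_set:
            fused.append(span)
    return fused

if __name__ == "__main__":
    slm = [((0, 2, "PER"), 0.95), ((5, 8, "LOC"), 0.40)]
    llm = [(5, 8, "LOC")]
    print(fuse_predictions(slm, llm))  # both spans kept: one by confidence, one by agreement

This kind of gating keeps the SLM as the primary predictor and consults the LLM only where the SLM is uncertain, which matches the abstract's "prioritise the SLM, use LLM guidance in low-confidence cases" description in spirit.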