Xiang Fu
2025
BLCU-ICALL at BEA 2025 Shared Task: Multi-Strategy Evaluation of AI Tutors
Jiyuan An | Xiang Fu | Bo Liu | Xuquan Zong | Cunliang Kong | Shuliang Liu | Shuo Wang | Zhenghao Liu | Liner Yang | Hanghang Fan | Erhong Yang
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
This paper describes our approaches for the BEA-2025 Shared Task on assessing pedagogical ability and attributing tutor identities in AI-powered tutoring systems. We explored three methodological paradigms: in-context learning (ICL), supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). Results indicate clear methodological strengths: SFT is highly effective for structured classification tasks such as mistake identification and feedback actionability, while ICL with advanced prompting excels at open-ended tasks involving mistake localization and instructional guidance. Additionally, fine-tuned models demonstrated strong performance in identifying tutor authorship. Our findings highlight the importance of aligning methodological strategy and task structure, providing insights toward more effective evaluations of educational AI systems.
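CCL25-Eval Task 6 System Report: Rhetoric Recognition in Primary and Secondary School Essays Based on Data Augmentation and Collaboration Between Large and Small Models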
Xuquan Zong | Jiyuan An | Xiang Fu | Luming Lu | Haonan Zhu | Liner Yang | Erhong Yang
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
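CCL25-Eval Task 6 introduces a paragraph-level, multi-level, fine-grained task on recognizing and understanding rhetoric in primary and secondary school essays. For the rhetoric classification task, we build a multi-strategy fusion framework centered on data augmentation and combined with efficient supervised fine-tuning. The framework integrates sentence-level rhetoric recognition with the modeling and recognition of inter-sentence relations within a paragraph, in order to comprehensively improve the model's understanding of rhetoric. For the rhetorical component extraction task, we adopt a two-stage strategy that first determines the rhetorical category and then recognizes rhetoric-related entities on that basis, which effectively improves overall recognition accuracy. Results show that the proposed method can effectively recognize and extract rhetoric, scoring 43.47, 51.71, and 38.27 on the three tracks and ranking second overall.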