2024
SpanCS:面向跨语言代码生成的片段级语码转换(SpanCS: Span-Level Code-Switching for Cross-Lingual Code Generation)
Zhu Qingfu (朱庆福) | Zhou Shiqi (周士祺) | Wang Shuo (王硕) | Zhang Zhiming (张致铭) | Wang Haoyu (王昊钰) | Chen Qiguang (陈麒光) | Che Wanxiang (车万翔)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
“Cross-lingual code generation aims to transfer the English-to-code generation ability to other natural languages. Translate-Train and Code-Switching are two classic data augmentation approaches for cross-lingual transfer; their strengths are complementary, but they have not yet been effectively combined. To this end, this paper proposes a span-level code-switching method (SpanCS) for cross-lingual code generation. First, the method uses a code-switching framework to connect source-language context with target-language spans, promoting interaction and alignment across languages. Second, it uses the translate-train approach to extract target-language spans from a complete translation of the source-language input, ensuring semantic consistency between the augmented data and the original data. To fairly evaluate differences in code generation performance across natural languages, this paper builds MHumanEval, a multilingual code generation benchmark covering 10 natural languages, constructed from HumanEval through human translation and verification. Experimental results with three backbone models on this benchmark show that SpanCS consistently outperforms previous data augmentation methods on cross-lingual code generation.”
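The span-level code-switching idea in this abstract can be pictured with a small data-augmentation sketch. The code below is only an illustration under stated assumptions, not the authors' SpanCS implementation: it assumes a full target-language translation of the English prompt (the translate-train output) and word-level alignments are already available, and it replaces one aligned source span with the corresponding target-language span while keeping the rest of the English context.

```python
import random

def span_code_switch(src_tokens, tgt_tokens, alignments, max_span=3, seed=0):
    """Illustrative span-level code-switching (not the official SpanCS code).

    src_tokens: tokenized English prompt
    tgt_tokens: tokenized full translation in the target language
    alignments: list of (src_idx, tgt_idx) word-alignment pairs
    Returns a mixed-language prompt in which one source span is replaced by
    the aligned target-language span, keeping the surrounding English context.
    """
    rng = random.Random(seed)
    # Pick a contiguous source span to switch.
    span_len = rng.randint(1, max_span)
    start = rng.randint(0, max(0, len(src_tokens) - span_len))
    span = range(start, start + span_len)

    # Collect the target-language tokens aligned to that span.
    tgt_idx = sorted({t for s, t in alignments if s in span})
    if not tgt_idx:
        return src_tokens  # nothing aligned; keep the original prompt

    return (src_tokens[:start]
            + [tgt_tokens[i] for i in tgt_idx]
            + src_tokens[start + span_len:])


# Hypothetical example: English prompt with a Chinese translate-train translation.
src = ["Return", "the", "sum", "of", "two", "numbers"]
tgt = ["返回", "两个", "数字", "的", "和"]
align = [(0, 0), (2, 4), (4, 1), (5, 2)]
print(" ".join(span_code_switch(src, tgt, align, seed=3)))
```

Keeping the source-language context while switching spans is what lets the model interact with and align both languages, while extracting the span from a full translation keeps the augmented prompt semantically consistent with the original.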
Self-Guide:一种基于自我规划的大语言模型推理增强方法(Self-Guide: Enhancing LLM Reasoning Ability via Self-Plan)
Liu Yibin (刘艺彬) | Liu Zhenghao (刘正皓) | Yan Yukun (闫宇坤) | Yu Shi (于是) | Wang Shuo (王硕) | Yang Liner (杨麟儿) | Chen Huimin (陈慧敏) | Gu Yu (谷峪) | Yu Ge (于戈)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
“Although large language models have made remarkable progress on natural language processing tasks, they still face a cognitive-load problem in areas such as complex reasoning: during inference the model must memorize and process large amounts of information. How to effectively reduce this cognitive load and mitigate potential cognitive overload during reasoning is therefore a pressing problem. To address it, this paper proposes Self-Guide, a method for enhancing the reasoning ability of language models. The method guides the large language model to generate commonsense knowledge and reasoning guidance, allowing the model to strengthen its reasoning through self-planning, and calibrates the reasoning process by combining this guidance with chain-of-thought reasoning. Unlike existing approaches, this work significantly improves reasoning performance without fine-tuning the large language model or using external tools. Experimental results show that Self-Guide significantly outperforms baseline methods on four common reasoning tasks and, compared with conventional chain-of-thought models, also generalizes well to models with weaker reasoning ability. By combining the self-planning and reasoning abilities of large language models, Self-Guide offers a new and effective way to improve the reasoning ability of language models.”
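On a rough reading of this abstract, Self-Guide first asks the model to produce commonsense knowledge and a reasoning plan for the question, then conditions a chain-of-thought answer on that self-generated guidance. The snippet below is only a prompting sketch built on that reading; the function name, prompt wording, and the `generate` callable are assumptions, not the paper's released prompts.

```python
def self_guide_answer(question, generate):
    """Two-stage prompting sketch in the spirit of Self-Guide (hypothetical wording).

    generate: a callable that sends a prompt string to an LLM and returns its text.
    Stage 1: the model writes down relevant knowledge and a step-by-step plan (self-plan).
    Stage 2: the model answers with chain-of-thought reasoning calibrated by that plan.
    """
    # Stage 1: self-generated knowledge and reasoning guidance.
    guidance = generate(
        "Question: " + question + "\n"
        "List the commonsense knowledge needed to answer this question, "
        "then outline a short step-by-step plan for solving it."
    )

    # Stage 2: chain-of-thought answer conditioned on the self-plan.
    answer = generate(
        "Question: " + question + "\n"
        "Knowledge and plan:\n" + guidance + "\n"
        "Follow the plan step by step, then give the final answer."
    )
    return answer
```

Because both stages are plain prompts, this kind of pipeline needs no fine-tuning and no external tools, which matches the setting the abstract describes.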
Enhancing Free-Form Table Question Answering Models by Distilling Relevant-Cell-Based Rationales
Yang Zhiyu | Wang Shuo | Yan Yukun | Liu Pengyuan | Yu Dong
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
“Free-form table question answering is a challenging task since tables contain structured contents compared to plain texts, which requires high-level reasoning abilities to effectively identify cells that are relevant to the question and produce a correct and faithful answer based on their relations. Large language models (LLMs) have exhibited remarkable reasoning capabilities in numerous NLP applications. However, in some specific tasks, specially-trained small models can still outperform LLMs. Furthermore, small models require far lower computation costs compared to LLMs. To leverage the strengths of both types of models, we propose a Relevant-Cell-based Knowledge Distillation with inference-time Teacher Guidance (RCKD-TG) method. This approach aims to combine small free-form table question answering models’ abilities to learn from human annotations and large language models’ abilities to effectively reason from table contents, via applying Relevant-Cell-based rationales distilled from LLMs to small models’ training and inference stages. Our experiments demonstrate the superiority of our method over vanilla small models in correctness, faithfulness, adequacy and fluency, as well as over general LLMs in adhering to the style of human annotations. We achieve state-of-the-art performance on FeTaQA, a representative free-form table question answering benchmark. Our result of a 41.3 BLEU score demonstrates the feasibility of effectively using small models’ task-specific abilities and LLMs’ reasoning capabilities at the same time. Additionally, our method exhibits high computation efficiency and data efficiency. Compared to strong baselines, we achieve better performance with significantly less training data.”
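One way to picture the RCKD-TG data flow described above: an LLM teacher first selects the table cells relevant to each question and writes a short rationale over them, and that rationale is attached to the small model's input both when it is fine-tuned (distillation) and at inference time (teacher guidance). The sketch below is a speculative illustration of that flow under my own assumptions; the prompt text, table format, and `ask_teacher` helper are hypothetical, not the paper's code.

```python
def build_rationale_input(question, table, ask_teacher):
    """Speculative RCKD-TG-style input construction (not the authors' code).

    question: natural-language question about the table
    table: dict with "header" (list of column names) and "rows" (list of cell lists)
    ask_teacher: callable that queries the teacher LLM and returns its text
    Returns the augmented string fed to the small QA model during both
    training (distillation) and inference (teacher guidance).
    """
    # Linearize the table so the teacher LLM can read it.
    linearized = " | ".join(table["header"]) + "\n" + "\n".join(
        " | ".join(str(c) for c in row) for row in table["rows"]
    )

    # Teacher step 1: identify the question-relevant cells.
    relevant_cells = ask_teacher(
        f"Table:\n{linearized}\nQuestion: {question}\n"
        "List the cells (row, column, value) needed to answer the question."
    )

    # Teacher step 2: write a short rationale grounded in those cells.
    rationale = ask_teacher(
        f"Question: {question}\nRelevant cells: {relevant_cells}\n"
        "Explain in one or two sentences how these cells answer the question."
    )

    # The small model is trained on, and later run with, question + rationale + table.
    return f"question: {question} rationale: {rationale} table: {linearized}"
```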