Chen Qiguang
Also published as: 麒光 陈
2024
SpanCS:面向跨语言代码生成的片段级语码转换(SpanCS: Span-Level Code-Switching for Cross-Lingual Code Generation)
Zhu Qingfu (朱庆福)
|
Zhou Shiqi (周士祺)
|
Wang Shuo (王硕)
|
Zhang Zhiming (张致铭)
|
Wang Haoyu (王昊钰)
|
Chen Qiguang (陈麒光)
|
Che Wanxiang (车万翔)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
“跨语言代码生成旨在将英语到代码的生成能力迁移至其他自然语言。翻译-训 练(Translate-Train)和语码转换(Code-Switching)是实现跨语言迁移的两类经典数据增广方法,两者优势互补但尚未有效结合。为此,本文提出了一种面向跨语言代码生成的片段级语码转换(SpanCS)方法。首先,该方法利用语码转换框架关联源语言上下文与目标语言片段,以促进多种语言的交互和对齐。其次,该方法利用翻译-训练方法从完整的源语言翻译中提取目标语言片段,以保证增广数据与原始数据间的语义一致性。为了公平地评价多种自然语言之间代码生成的性能差异,本文通过人工翻译与校验,基于HumanEval构建了包含10种自然语言的多语言代码生成评测基MHumanEval。该基准上的三个主干模型的实验结果表明,SpanCS在跨语言代码生成任务上一致优于前人的数据增广方法。”
2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Zhuang Ziyu
|
Chen Qiguang
|
Ma Longxuan
|
Li Mingda
|
Han Yi
|
Qian Yushan
|
Bai Haopeng
|
Zhang Weinan
|
Ting Liu
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 2: Frontier Forum)
“From pre-trained language model (PLM) to large language model (LLM), the field of naturallanguage processing (NLP) has witnessed steep performance gains and wide practical uses. Theevaluation of a research field guides its direction of improvement. However, LLMs are extremelyhard to thoroughly evaluate for two reasons. First of all, traditional NLP tasks become inade-quate due to the excellent performance of LLM. Secondly, existing evaluation tasks are difficultto keep up with the wide range of applications in real-world scenarios. To tackle these problems,existing works proposed various benchmarks to better evaluate LLMs. To clarify the numerousevaluation tasks in both academia and industry, we investigate multiple papers concerning LLMevaluations. We summarize 4 core competencies of LLM, including reasoning, knowledge, relia-bility, and safety. For every competency, we introduce its definition, corresponding benchmarks,and metrics. Under this competency architecture, similar tasks are combined to reflect corre-sponding ability, while new tasks can also be easily added into the system. Finally, we give oursuggestions on the future direction of LLM’s evaluation.”
Search
Fix data
Co-authors
- Bai Haopeng 1
- Wang Haoyu (王昊钰) 1
- Ting Liu 1
- Ma Longxuan 1
- Li Mingda 1
- show all...
Venues
- ccl2