Erhong Yang - ACL Anthology

Erhong Yang

Also published as: 尔弘杨, Erhong YANG

2025

BLCU-ICALL at BEA 2025 Shared Task: Multi-Strategy Evaluation of AI Tutors
Jiyuan An | Xiang Fu | Bo Liu | Xuquan Zong | Cunliang Kong | Shuliang Liu | Shuo Wang | Zhenghao Liu | Liner Yang | Hanghang Fan | Erhong Yang
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

This paper describes our approaches for the BEA-2025 Shared Task on assessing pedagogical ability and attributing tutor identities in AI-powered tutoring systems. We explored three methodological paradigms: in-context learning (ICL), supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). Results indicate clear methodological strengths: SFT is highly effective for structured classification tasks such as mistake identification and feedback actionability, while ICL with advanced prompting excels at open-ended tasks involving mistake localization and instructional guidance. Additionally, fine-tuned models demonstrated strong performance in identifying tutor authorship. Our findings highlight the importance of aligning methodological strategy and task structure, providing insights toward more effective evaluations of educational AI systems.

2024

面向语言学习者的跨语言反馈评语生成方法(Cross-Lingual Feedback Comment Generation for Language Learners)
Jiyuan An (安纪元) | Lin Zhu (朱琳) | Erhong Yang (杨尔弘)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“反馈评语生成任务旨在为语言学习者的产出提供纠偏及解释性的评价,促进学习者写作能力的发展。现有研究主要聚焦于单语的反馈评语生成,如为英语学习者提供英文反馈评语,但这忽略了非母语学习者可能面临的理解障碍问题,尤其当评语中存在陌生的语言知识时。因此,本文提出跨语言反馈评语生成任务(CLFCG),目的是为语言学习者生成母语的反馈评语。本研究构建了首个英甭中跨语言反馈评语生成数据集,该数据集包含英语学习者产出的语句与相应的中文反馈评语,并探索了基于流水线的预训练语言模型引导增强生成方法,将修正编辑、线索词语和语法术语等作为输入的附加信息,引导和提示生成模型。实验结果表明,附加引导信息的预训练语言模型流水线方法在自动评估(BLEU:50.32)与人工评估(Precision:62.84)上表现良好。本文对实验结果进行了深入分析,以期为跨语言反馈评语生成任务提供更多见解。”

Automatic Construction of the English Sentence Pattern Structure Treebank for Chinese ESL learners
Lin Zhu | Meng Xu | Wenya Guo | Jingsi Yu | Liner Yang | Zehuang Cao | Yuan Huang | Erhong Yang
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“Analyzing long and complicated sentences has always been a priority and challenge in Englishlearning. In order to conduct the parse of these sentences for Chinese English as Second Lan-guage (ESL) learners, we design the English Sentence Pattern Structure (ESPS) based on theSentence Diagramming theory. Then, we automatically construct the English Sentence PatternStructure Treebank (ESPST) through the method of rule conversion based on constituency struc-ture and evaluate the conversion results. In addition, we set up two comparative experiments,using trained parser and large language models (LLMs). The results prove that the rule-basedconversion approach is effective.”

Cost-efficient Crowdsourcing for Span-based Sequence Labeling:Worker Selection and Data Augmentation
Yujie Wang | Chao Huang | Liner Yang | Zhixuan Fang | Yaping Huang | Yang Liu | Jingsi Yu | Erhong Yang
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“This paper introduces a novel crowdsourcing worker selection algorithm, enhancing annotationquality and reducing costs. Unlike previous studies targeting simpler tasks, this study con-tends with the complexities of label interdependencies in sequence labeling. The proposedalgorithm utilizes a Combinatorial Multi-Armed Bandit (CMAB) approach for worker selec-tion, and a cost-effective human feedback mechanism. The challenge of dealing with imbal-anced and small-scale datasets, which hinders offline simulation of worker selection, is tack-led using an innovative data augmentation method termed shifting, expanding, and shrink-ing (SES). Rigorous testing on CoNLL 2003 NER and Chinese OEI datasets showcased thealgorithm’s efficiency, with an increase in F1 score up to 100.04% of the expert-only base-line, alongside cost savings up to 65.97%. The paper also encompasses a dataset-independenttest emulating annotation evaluation through a Bernoulli distribution, which still led to animpressive 97.56% F1 score of the expert baseline and 59.88% cost savings. Furthermore,our approach can be seamlessly integrated into Reinforcement Learning from Human Feed-back (RLHF) systems, offering a cost-effective solution for obtaining human feedback. All re-sources, including source code and datasets, are available to the broader research community athttps://github.com/blcuicall/nlp-crowdsourcing.”

MCTS: A Multi-Reference Chinese Text Simplification Dataset
Ruining Chong | Luming Lu | Liner Yang | Jinran Nie | Zhenghao Liu | Shuo Wang | Shuhan Zhou | Yaoxin Li | Erhong Yang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Text simplification aims to make the text easier to understand by applying rewriting transformations. There has been very little research on Chinese text simplification for a long time. The lack of generic evaluation data is an essential reason for this phenomenon. In this paper, we introduce MCTS, a multi-reference Chinese text simplification dataset. We describe the annotation process of the dataset and provide a detailed analysis. Furthermore, we evaluate the performance of several unsupervised methods and advanced large language models. We additionally provide Chinese text simplification parallel data that can be used for training, acquired by utilizing machine translation and English text simplification. We hope to build a basic understanding of Chinese text simplification through the foundational work and provide references for future research. All of the code and data are released at https://github.com/blcuicall/mcts/.

2023

Leveraging Prefix Transfer for Multi-Intent Text Revision
Ruining Chong | Cunliang Kong | Liu Wu | Zhenghao Liu | Ziye Jin | Liner Yang | Yange Fan | Hanghang Fan | Erhong Yang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Text revision is a necessary process to improve text quality. During this process, writers constantly edit texts out of different edit intentions. Identifying edit intention for a raw text is always an ambiguous work, and most previous work on revision systems mainly focuses on editing texts according to one specific edit intention. In this work, we aim to build a multi-intent text revision system that could revise texts without explicit intent annotation. Our system is based on prefix-tuning, which first gets prefixes for every edit intent, and then trains a prefix transfer module, enabling the system to selectively leverage the knowledge from various prefixes according to the input text. We conduct experiments on the IteraTeR dataset, and the results show that our system outperforms baselines. The system can significantly improve the SARI score with more than 3% improvements, which thrives on the learned editing intention prefixes.

2022

Multitasking Framework for Unsupervised Simple Definition Generation
Cunliang Kong | Yun Chen | Hengyuan Zhang | Liner Yang | Erhong Yang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner’s dictionaries in many languages, and therefore the lack of data for supervised training. We explore this task and propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts. We disentangle the complexity factors from the text by carefully designing a parameter sharing scheme between two decoders. By jointly training these components, the framework can generate both complex and simple definitions simultaneously. We demonstrate that the framework can generate relevant, simple definitions for the target words through automatic and manual evaluations on English and Chinese datasets. Our method outperforms the baseline model by a 1.77 SARI score on the English dataset, and raises the proportion of the low level (HSK level 1-3) words in Chinese definitions by 3.87%.

汉语增强依存句法自动转换研究(Transformation of Enhanced Dependencies in Chinese)
Jingsi Yu (余婧思) | Shi Jialu (师佳璐) | Liner Yang (杨麟儿) | Dan Xiao (肖丹) | Erhong Yang (杨尔弘)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“自动句法分析是自然语言处理中的一项核心任务,受限于依存句法中每个节点只能有一条入弧的规则,基础依存句法中许多实词之间的关系无法用依存弧和依存标签直接标明;同时,已有的依存句法体系中的依存关系还有进一步细化、提升的空间,以便从中提取连贯的语义关系。面对这种情况,本文在斯坦福基础依存句法规范的基础上,研制了汉语增强依存句法规范,主要贡献在于:介词和连词的增强、并列项的传播、句式转换和特殊句式的增强。此外,本文提供了基于Python的汉语增强依存句法转换的转换器,以及一个基于Web的演示,该演示将句子从基础依存句法树通过本文的规范解析成依存图。最后,本文探索了增强依存句法的实际应用,并以搭配抽取和信息抽取为例进行相关讨论。”

句式结构树库的自动构建研究(Automatic Construction of Sentence Pattern Structure Treebank)
Chenhui Xie (谢晨晖) | Zhengsheng Hu (胡正升) | Liner Yang (杨麟儿) | Tianxin Liao (廖田昕) | Erhong Yang (杨尔弘)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“句式结构树库是以句本位语法为理论基础构建的句法资源,对汉语教学以及句式结构自动句法分析等研究具有重要意义。目前已有的句式结构树库语料主要来源于教材领域,其他领域的标注数据较为缺乏,如何高效地扩充高质量的句法树库是值得研究的问题。人工标注句法树库费时费力,并且树库质量也难以保证,为此,本文尝试通过规则的方法,将宾州中文树库(ctb)转换为句式结构树库,从而扩大现有句式结构树库的规模。实验结果表明,本文提出的基于树库转换规则的方法是有效的。”

COMPILING: A Benchmark Dataset for Chinese Complexity Controllable Definition Generation
Jiaxin Yuan | Cunliang Kong | Chenhui Xie | Liner Yang | Erhong Yang
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“The definition generation task aims to generate a word’s definition within a specific context automatically. However, owing to the lack of datasets for different complexities, the definitions produced by models tend to keep the same complexity level. This paper proposes a novel task of generating definitions for a word with controllable complexity levels. Correspondingly, we introduce COMPILING, a dataset given detailed information about Chinese definitions, and each definition is labeled with its complexity levels. The COMPILING dataset includes 74,303 words and 106,882 definitions. To the best of our knowledge, it is the largest dataset of the Chinese definition generation task. We select various representative generation methods as baselines for this task and conduct evaluations, which illustrates that our dataset plays an outstanding role in assisting models in generating different complexity-level definitions. We believe that the COMPILING dataset will benefit further research in complexity controllable definition generation.”

CTAP for Chinese:A Linguistic Complexity Feature Automatic Calculation Platform
Yue Cui | Junhui Zhu | Liner Yang | Xuezhi Fang | Xiaobin Chen | Yujie Wang | Erhong Yang
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The construct of linguistic complexity has been widely used in language learning research. Several text analysis tools have been created to automatically analyze linguistic complexity. However, the indexes supported by several existing Chinese text analysis tools are limited and different because of different research purposes. CTAP is an open-source linguistic complexity measurement extraction tool, which prompts any research purposes. Although it was originally developed for English, the Unstructured Information Management (UIMA) framework it used allows the integration of other languages. In this study, we integrated the Chinese component into CTAP, describing the index sets it incorporated and comparing it with three linguistic complexity tools for Chinese. The index set includes four levels of 196 linguistic complexity indexes: character level, word level, sentence level, and discourse level. So far, CTAP has implemented automatic calculation of complexity characteristics for four languages, aiming to help linguists without NLP background study language complexity.

BLCU-ICALL at SemEval-2022 Task 1: Cross-Attention Multitasking Framework for Definition Modeling
Cunliang Kong | Yujie Wang | Ruining Chong | Liner Yang | Hengyuan Zhang | Erhong Yang | Yaping Huang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes the BLCU-ICALL system used in the SemEval-2022 Task 1 Comparing Dictionaries and Word Embeddings, the Definition Modeling subtrack, achieving 1st on Italian, 2nd on Spanish and Russian, and 3rd on English and French. We propose a transformer-based multitasking framework to explore the task. The framework integrates multiple embedding architectures through the cross-attention mechanism, and captures the structure of glosses through a masking language model objective. Additionally, we also investigate a simple but effective model ensembling strategy to further improve the robustness. The evaluation results show the effectiveness of our solution. We release our code at: https://github.com/blcuicall/SemEval2022-Task1-DM.

2021

中美学者学术英语写作中词汇难度特征比较研究——以计算语言学领域论文为例(A Comparative Study of the Features of Lexical Sophistication in Academic English Writing by Chinese and American)
Yonghui Xie (谢永慧) | Yang Liu (刘洋) | Erhong Yang (杨尔弘) | Liner Yang (杨麟儿)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

学术英语写作在国际学术交流中的作用日益凸显,然而对于英语非母语者,学术英语写作是困难的,为此本文对计算语言领域中美学者学术英语写作中词汇难度特征做比较研究。自构建1132篇中美论文全文语料库,统计语料中484个词汇难度特征值。经过特征筛选与因子分析的降维处理得到表现较好的五个维度。最后计算中美学者论文的维度分从而比较差异,发现美国学者的论文相较中国学者的论文中词汇单位更具常用性、二元词串更具稳固性、三元词串更具稳固性、虚词更具复杂性、词类更具关联性。主要原因在于统计特征值时借助的外部资源库与美国学者的论文更贴近,且中国学者没有完全掌握该领域学术写作的习惯。因此,中国学者可充分利用英语本族语者构建的资源库,从而产出更为地道与流利的学术英语论文。

2020

面向汉语作为第二语言学习的个性化语法纠错(Personalizing Grammatical Error Correction for Chinese as a Second Language)
Shengsheng Zhang (张生盛) | Guina Pang (庞桂娜) | Liner Yang (杨麟儿) | Chencheng Wang (王辰成) | Yongping Du (杜永萍) | Erhong Yang (杨尔弘) | Yaping Huang (黄雅平)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

语法纠错任务旨在通过自然语言处理技术自动检测并纠正文本中的语序、拼写等语法错误。当前许多针对汉语的语法纠错方法已取得较好的效果,但往往忽略了学习者的个性化特征,如二语等级、母语背景等。因此,本文面向汉语作为第二语言的学习者,提出个性化语法纠错,对不同特征的学习者所犯的错误分别进行纠正,并构建了不同领域汉语学习者的数据集进行实验。实验结果表明,将语法纠错模型适应到学习者的各个领域后,性能得到明显提升。

基于BERT与柱搜索的中文释义生成(Chinese Definition Modeling Based on BERT and Beam Seach)
Qinan Fan (范齐楠) | Cunliang Kong (孔存良) | Liner Yang (杨麟儿) | Erhong Yang (杨尔弘)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

释义生成任务是指为一个目标词生成相应的释义。前人研究中文释义生成任务时未考虑目标词的上下文,本文首次在中文释义生成任务中使用了目标词的上下文信息,并提出了一个基于BERT与柱搜索的释义生成模型。本文构建了包含上下文的CWN中文数据集用于开展实验,除了BLEU指标之外,还使用语义相似度作为额外的自动评价指标,实验结果显示本文模型在中文CWN数据集和英文Oxford数据集上均有显著提升,人工评价结果也与自动评价结果一致。最后,本文对生成实例进行了深入分析。

汉语学习者依存句法树库构建(Construction of a Treebank of Learner Chinese)
Jialu Shi (师佳璐) | Xinyu Luo (罗昕宇) | Liner Yang (杨麟儿) | Dan Xiao (肖丹) | Zhengsheng Hu (胡正声) | Yijun Wang (王一君) | Jiaxin Yuan (袁佳欣) | Yu Jingsi (余婧思) | Erhong Yang (杨尔弘)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

汉语学习者依存句法树库为非母语者语料提供依存句法分析,可以支持第二语言教学与研究,也对面向第二语言的句法分析、语法改错等相关研究具有重要意义。然而,现有的汉语学习者依存句法树库数量较少,且在标注方面仍存在一些问题。为此,本文改进依存句法标注规范,搭建在线标注平台,并开展汉语学习者依存句法标注。本文重点介绍了数据选取、标注流程等问题,并对标注结果进行质量分析,探索二语偏误对标注质量与句法分析的影响。

Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications
Erhong YANG | Endong XUN | Baolin ZHANG | Gaoqi RAO
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

Overview of NLPTEA-2020 Shared Task for Chinese Grammatical Error Diagnosis
Gaoqi Rao | Erhong Yang | Baolin Zhang
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

This paper presents the NLPTEA 2020 shared task for Chinese Grammatical Error Diagnosis (CGED) which seeks to identify grammatical error types, their range of occurrence and recommended corrections within sentences written by learners of Chinese as a foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 30 teams registered for this shared task, 17 teams developed the system and submitted a total of 43 runs. System performances achieved a significant progress, reaching F1 of 91% in detection level, 40% in position level and 28% in correction level. All data sets with gold standards and scoring scripts are made publicly available to researchers.

2010

The Annotation of Event Schema in Chinese
Hongjian Zou | Erhong Yang | Yan Gao | Qingqing Zeng
Proceedings of the Eighth Workshop on Asian Language Resouces

2000

The Research of Word Sense Disambiguation Method Based on Co-occurrence Frequency of Hownet
Erhong Yang | Guoqing Zhang | Yongkui Zhang
Second Chinese Language Processing Workshop

Co-authors

Zhengsheng Hu 2

Hengyuan Zhang 2

Yang Liu (刘洋) 1

Chencheng Wang 1

Qingqing Zeng 1

Shengsheng Zhang 1

Guoqing Zhang 1

Yongkui Zhang 1

Venues