Hongying Zan (昝红英) - ACL Anthology

Hongying Zan

Also published as: Hong-ying Zan, 红英昝

2025

融合MOE的多任务学习文档级企业新闻事件抽取
Aoze Zheng | Kunli Zhang | Ying Wang | Songrui Yuan | Yihao Tian | Hongying Zan
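Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)

Corporate news event extraction is a key technology for analyzing corporate dynamics and supporting industrial decision-making. Corporate news texts are long and varied in content, so extraction faces core challenges such as multiple events per document and scattered arguments. Large language models (LLMs) have strong long-range dependency modeling and semantic association capabilities, but general-purpose LLMs struggle to meet the domain expertise and resource efficiency that enterprise applications require. This paper proposes MoE-Enhanced Multi-Task Learning for Corporate News Event Extraction (MoE-ML-CNEE). By constructing a unified fine-tuning dataset and a multi-task joint training paradigm, we cast event detection and argument extraction as structured language templates, strengthening the model’s global modeling ability. We design a MoELoRA module that uses a dynamic routing mechanism to enable knowledge sharing and feature disentanglement among multiple expert networks in a low-rank space, further improving event extraction performance. Experiments show that MoE-ML-CNEE outperforms existing baseline models in event detection and event argument extraction on the public ChiFinAnn and DuEE-fin datasets and on a self-built corporate news dataset.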

SILC-EFSA: Self-aware In-context Learning Correction for Entity-level Financial Sentiment Analysis
Senbin Zhu | ChenYuan He | Hongde Liu | Pengcheng Dong | Hanjie Zhao | Yuchen Yan | Yuxiang Jia | Hongying Zan | Min Peng
Proceedings of the 31st International Conference on Computational Linguistics

In recent years, fine-grained sentiment analysis in finance has gained significant attention, but the scarcity of entity-level datasets remains a key challenge. To address this, we have constructed the largest English and Chinese financial entity-level sentiment analysis datasets to date. Building on this foundation, we propose a novel two-stage sentiment analysis approach called Self-aware In-context Learning Correction (SILC). The first stage involves fine-tuning a base large language model to generate pseudo-labeled data specific to our task. In the second stage, we train a correction model using a GNN-based example retriever, which is informed by the pseudo-labeled data. This two-stage strategy has allowed us to achieve state-of-the-art performance on the newly constructed datasets, advancing the field of financial sentiment analysis. In a case study, we demonstrate the enhanced practical utility of our data and methods in monitoring the cryptocurrency market. Our datasets and code are available at https://github.com/NLP-Bin/SILC-EFSA.

JOLT-SQL: Joint Loss Tuning of Text-to-SQL with Confusion-aware Noisy Schema Sampling
Jinwang Song | Hongying Zan | Kunli Zhang | Lingling Mu | Yingjie Han | Haobo Hua | Min Peng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Text-to-SQL, which maps natural language to SQL queries, has benefited greatly from recent advances in Large Language Models (LLMs). While LLMs offer various paradigms for this task, including prompting and supervised fine-tuning (SFT), SFT approaches still face challenges such as complex multi-stage pipelines and poor robustness to noisy schema information. To address these limitations, we present JOLT-SQL, a streamlined single-stage SFT framework that jointly optimizes schema linking and SQL generation via a unified loss. JOLT-SQL employs discriminative schema linking, enhanced by local bidirectional attention, alongside a confusion-aware noisy schema sampling strategy with selective attention to improve robustness under noisy schema conditions. Experiments on the Spider and BIRD benchmarks demonstrate that JOLT-SQL achieves state-of-the-art execution accuracy among comparable-size open-source models, while significantly improving both training and inference efficiency.

CCL25-Eval任务11系统报告:基于大模型微调的汉字硬笔书写质量自动评价
KongLulu | Hongying Zan | Jinwang Song | LiuHaixin | Yifan Li | Zhewei Luo
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)

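This technical report explores fine-tuning a local vision-language model to automatically evaluate the quality of hard-pen Chinese character handwriting. To address the difficulty traditional evaluation methods have in providing accurate feedback, our team combined carefully designed prompts with fine-tuning to build an efficient automatic evaluation system. We use Qwen2.5-VL-7B-Instruct as the base model and apply LoRA fine-tuning to perform handwriting quality grade classification (Subtask 1) and personalized comment generation (Subtask 2). The system integrates visual feature analysis with language generation. During training we use gradient checkpointing and BF16 mixed-precision training to reduce GPU memory usage, and we design task-specific loss functions and evaluation metrics. Experimental results show that our method can effectively perform fine-grained evaluation of Chinese character handwriting quality.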

DialogueMMT: Dialogue Scenes Understanding Enhanced Multi-modal Multi-task Tuning for Emotion Recognition in Conversations
ChenYuan He | Senbin Zhu | Hongde Liu | Fei Gao | Yuxiang Jia | Hongying Zan | Min Peng
Proceedings of the 31st International Conference on Computational Linguistics

Emotion recognition in conversations (ERC) has garnered significant attention from the research community. However, due to the complexity of visual scenes and dialogue contextual dependencies in conversations, previous ERC methods fail to handle emotional cues from both visual sources and discourse structures. Furthermore, existing state-of-the-art ERC models are trained and tested separately on each single ERC dataset, not verifying their effectiveness across multiple datasets simultaneously. To address these challenges, this paper proposes an innovative framework for ERC, called Dialogue Scenes Understanding Enhanced Multi-modal Multi-task Tuning (DialogueMMT). More concretely, a novel video-language connector is applied within the large vision-language model for capturing video features effectively. Additionally, we utilize multi-task instruction tuning with a unified ERC dataset to enhance the model’s understanding of multi-modal dialogue scenes and employ a chain-of-thought strategy to improve emotion classification performance. Extensive experimental results on three benchmark ERC datasets indicate that the proposed DialogueMMT framework consistently outperforms existing state-of-the-art approaches in terms of overall performance.

GenWebNovel: A Genre-oriented Corpus of Entities in Chinese Web Novels
Hanjie Zhao | Yuchen Yan | Senbin Zhu | Hongde Liu | Yuxiang Jia | Hongying Zan | Min Peng
Proceedings of the 31st International Conference on Computational Linguistics

Entities are important to understanding literary works, which emphasize characters, plots and environment. The research on entity recognition, especially nested entity recognition in the literary domain is still insufficient partly due to insufficient annotated data. To address this issue, we construct the first Genre-oriented Corpus for Entity Recognition in Chinese Web Novels, namely GenWebNovel, comprising 400 chapters totaling 1,214,283 tokens under two genres, XuanHuan (Eastern Fantasy) and History. Based on the corpus, we analyze the distribution of different types of entities, including person, location, and organization. We also compare the nesting patterns of nested entities between GenWebNovel and the English corpus LitBank. Even though both belong to the literary domain, entities in different genres share few overlaps, making genre adaptation of NER (Named Entity Recognition) a hard problem. We propose a novel method that utilizes a pre-trained language model as an In-context learning example retriever to boost the performance of large language models. Our experiments show that this approach significantly enhances entity recognition, matching state-of-the-art (SOTA) models without requiring additional training data. Our code, dataset, and model are available at https://github.com/hjzhao73/GenWebNovel.

面向法律事件检测的大模型协同主动学习框架
Tingting Cui | Hongying Zan | Xinmeng Ji | Jinwang Song | Kunli Zhang | Yuxiang Jia
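Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)

Legal event detection aims to identify and classify events in legal texts. However, the complexity of legal cases makes collecting high-quality annotated data very challenging. Domain data annotation currently relies mainly on human annotators, which is costly and time-consuming. Although traditional active learning can reduce part of the annotation demand, it still depends on human intervention. The development of large models makes automated data annotation possible, but ensuring annotation reliability remains an open problem. To this end, this paper proposes a novel collaborative training paradigm that uses active learning to iteratively select training data, uses a large model to generate annotations, and applies an evaluation and filtering mechanism to retain the high-quality ones, greatly reducing the manual annotation workload. Experiments on two event detection benchmark datasets show that the method significantly reduces the need for manual annotation in low-resource settings and in some cases approaches the performance of supervised learning.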

BiasFilter: An Inference-Time Debiasing Framework for Large Language Models
Xiaoqing Cheng | Ruizhe Chen | Hongying Zan | Yuxiang Jia | Min Peng
Findings of the Association for Computational Linguistics: EMNLP 2025

Mitigating social bias in large language models (LLMs) has become an increasingly important research objective. However, existing debiasing methods often incur high human and computational costs, exhibit limited effectiveness, and struggle to scale to larger models and open-ended generation tasks. To address these limitations, this paper proposes BiasFilter, a model-agnostic, inference-time debiasing framework that integrates seamlessly with both open-source and API-based LLMs. Instead of relying on retraining with balanced data or modifying model parameters, BiasFilter enforces fairness by filtering generation outputs in real time. Specifically, it periodically evaluates intermediate outputs every few tokens, maintains an active set of candidate continuations, and incrementally completes generation by discarding low-reward segments based on a fairness reward signal. To support this process, we construct a fairness preference dataset and train an implicit reward model to assess token-level fairness in generated responses. Extensive experiments demonstrate that BiasFilter effectively mitigates social bias across a range of LLMs while preserving overall generation quality.
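The segment-level filtering loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `extend` and `reward` are hypothetical callables standing in for an LLM that proposes continuations and for the trained fairness reward model, and the toy versions below exist only to make the example runnable.

```python
def filtered_generation(prompt, extend, reward, num_segments=3, keep=2):
    """Grow candidates segment by segment, keeping only the top-`keep` by reward.

    `extend(text)` returns a list of longer candidate texts (hypothetical LLM call).
    `reward(text)` returns a score where higher means fairer (hypothetical reward model).
    """
    active = [prompt]
    for _ in range(num_segments):
        # Expand every active candidate by one segment of a few tokens.
        candidates = [c for text in active for c in extend(text)]
        # Discard low-reward segments; sorted() is stable, so ties keep their order.
        active = sorted(candidates, key=reward, reverse=True)[:keep]
    return active[0]


# Toy stand-ins, assumptions for illustration only.
def toy_extend(text):
    return [text + " fair", text + " biased"]


def toy_reward(text):
    return -text.count("biased")


result = filtered_generation("Reply:", toy_extend, toy_reward, num_segments=2, keep=2)
print(result)
```

Keeping more than one active candidate lets a continuation that scores slightly lower at one step recover later, at the cost of more calls to the generator and the reward function.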

CaDRL: Document-level Relation Extraction via Context-aware Differentiable Rule Learning
Kunli Zhang | Pengcheng Wu | Bohan Yu | Kejun Wu | Aoze Zheng | Xiyang Huang | Chenkang Zhu | Min Peng | Hongying Zan | Yu Song
Proceedings of the 31st International Conference on Computational Linguistics

Document-level Relation Extraction (DocRE) aims to extract relations from documents. Compared with sentence-level relation extraction, it is necessary to extract long-distance dependencies. Existing methods enhance the output of trained DocRE models either by learning logical rules or by extracting rules from annotated data and then injecting them into the model. However, these approaches can result in suboptimal performance due to incorrect rule set constraints. To mitigate this issue, we propose Context-aware differentiable rule learning or CaDRL for short, a novel differentiable rule-based framework that learns the doc-specific logical rule to avoid generating suboptimal constraints. Specifically, we utilize Transformer-based relation attention to encode document and relation information, thereby learning the contextual information of the relation. We employ a sequence-generated differentiable rule decoder to generate relational probabilistic logic rules at each reasoning step. We also introduce a parameter sharing training mechanism in CaDRL to reconcile the DocRE model and the rule learning module. Extensive experimental results on three DocRE datasets demonstrate that CaDRL outperforms existing rule-based frameworks, significantly improving DocRE performance and making predictions more interpretable and logical.

Task-aware Contrastive Mixture of Experts for Quadruple Extraction in Conversations with Code-like Replies and Non-opinion Detection
Chenyuan He | Yuxiang Jia | Fei Gao | Senbin Zhu | Hongde Liu | Hongying Zan | Min Peng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

This paper focuses on Dialogue Aspect-based Sentiment Quadruple (DiaASQ) analysis, aiming to extract structured quadruples from multi-turn conversations. Applying Large Language Models (LLMs) for this specific task presents two primary challenges: the accurate extraction of multiple elements and the understanding of complex dialogue reply structure. To tackle these issues, we propose a novel LLM-based multi-task approach, named Task-aware Contrastive Mixture of Experts (TaCoMoE), to tackle the DiaASQ task by integrating expert-level contrastive loss within task-oriented mixture of experts layer. TaCoMoE minimizes the distance between the representations of the same expert in the semantic space while maximizing the distance between the representations of different experts to efficiently learn representations of different task samples. Additionally, we design a Graph-Centric Dialogue Structuring strategy for representing dialogue reply structure and perform non-opinion utterances detection to enhance the performance of quadruple extraction. Extensive experiments are conducted on the DiaASQ dataset, demonstrating that our method significantly outperforms existing parameter-efficient fine-tuning techniques in terms of both accuracy and computational efficiency. The code is available at https://github.com/he2720/TaCoMoE.

CCL25-Eval任务1系统报告:使用思维链和投票集成增强大型语言模型空间语义理解
LiuHaixin | Hongying Zan | Jinwang Song | Yifan Li | KongLulu
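Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)

This technical report describes our team’s methods and results in the Fifth Spatial Semantic Understanding Evaluation (SpaCE2025). SpaCE2025 continues to focus on evaluating the spatial semantic understanding of large language models, covering two core dimensions, spatial language understanding and spatial reasoning, with five subtasks: judging the correctness of spatial information, identifying spatial reference entities, judging whether differently worded spatial expressions are synonymous, Chinese spatial orientation relation reasoning, and English spatial orientation relation reasoning. By designing structured prompts and introducing chain-of-thought reasoning, combined with LoRA fine-tuning and voting ensembles, we effectively improved the performance of large language models on spatial semantic understanding tasks. In the final evaluation, our team achieved an overall accuracy of 0.5983 across the five subtasks, ranking fifth overall.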

2024

Dual-teacher Knowledge Distillation for Low-frequency Word Translation
Yifan Guo | Hongying Zan | Hongfei Xu
Findings of the Association for Computational Linguistics: EMNLP 2024

Neural Machine Translation (NMT) models are trained on parallel corpora with unbalanced word frequency distributions. As a result, NMT models tend to prefer high-frequency words over low-frequency ones, even though low-frequency words may carry crucial semantic information, and neglecting them may hamper translation quality. The objective of this study is to enhance the translation of meaningful but low-frequency words. Our general idea is to optimize the translation of low-frequency words through knowledge distillation. Specifically, we employ a low-frequency teacher model that excels in translating low-frequency words to guide the learning of the student model. To preserve the translation quality of high-frequency words, we further introduce a dual-teacher distillation framework, leveraging both the low-frequency and high-frequency teacher models to guide the student model’s training. Our single-teacher distillation method already achieves a +0.64 BLEU improvement over the state-of-the-art method on the low-frequency test set of the WMT 16 English-to-German translation task. Our dual-teacher framework leads to +0.87, +1.24, +0.47, +0.87 and +0.86 BLEU improvements on the IWSLT 14 German-to-English, WMT 16 English-to-German, WMT 15 English-to-Czech, WMT 14 English-to-French and WMT 18 Chinese-to-English tasks respectively compared to the baseline, while maintaining the translation performance of high-frequency words.

MRC-based Nested Medical NER with Co-prediction and Adaptive Pre-training
Xiaojing Du | Hanjie Zhao | Danyan Xing | Yuxiang Jia | Hongying Zan
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In medical information extraction, medical Named Entity Recognition (NER) is indispensable, playing a crucial role in developing medical knowledge graphs, enhancing medical question-answering systems, and analyzing electronic medical records. The challenge in medical NER arises from the complex nested structures and sophisticated medical terminologies, distinguishing it from its counterparts in traditional domains. In response to these complexities, we propose a medical NER model based on Machine Reading Comprehension (MRC), which uses a task-adaptive pre-training strategy to improve the model’s capability in the medical field. Meanwhile, our model introduces multiple word-pair embeddings and multi-granularity dilated convolution to enhance the model’s representation ability and uses a combined predictor of Biaffine and MLP to improve the model’s recognition performance. Experimental evaluations conducted on the CMeEE, a benchmark for Chinese nested medical NER, demonstrate that our proposed model outperforms the compared state-of-the-art (SOTA) models.

OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety
Chuang Liu | Linhao Yu | Jiaxuan Li | Renren Jin | Yufei Huang | Ling Shi | Junhui Zhang | Xinmeng Ji | Tingting Cui | Tao Liu | Jinwang Song | Hongying Zan | Sun Li | Deyi Xiong
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

The rapid development of Chinese large language models (LLMs) poses big challenges for efficient LLM evaluation. While current initiatives have introduced new benchmarks or evaluation platforms for assessing Chinese LLMs, many of these focus primarily on capabilities, usually overlooking potential alignment and safety issues. To address this gap, we introduce OpenEval, an evaluation testbed that benchmarks Chinese LLMs across capability, alignment and safety. For capability assessment, we include 12 benchmark datasets to evaluate Chinese LLMs from 4 sub-dimensions: NLP tasks, disciplinary knowledge, commonsense reasoning and mathematical reasoning. For alignment assessment, OpenEval contains 7 datasets that examine the bias, offensiveness and illegalness in the outputs yielded by Chinese LLMs. To evaluate safety, especially anticipated risks (e.g., power-seeking, self-awareness) of advanced LLMs, we include 6 datasets. In addition to these benchmarks, we have implemented a phased public evaluation and benchmark update strategy to ensure that OpenEval is in line with the development of Chinese LLMs or even able to provide cutting-edge benchmark datasets to guide the development of Chinese LLMs. In our first public evaluation, we have tested a range of Chinese LLMs, spanning from 7B to 72B parameters, including both open-source and proprietary models. Evaluation results indicate that while Chinese LLMs have shown impressive performance in certain tasks, more attention should be directed towards broader aspects such as commonsense reasoning, alignment, and safety.

Essay Rhetoric Recognition and Understanding Using Synthetic Data and Model Ensemble Enhanced Large Language Models
Jinwang Song | Hongying Zan | Kunli Zhang
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

Natural language processing technology has been widely applied in the field of education. Essay writing serves as a crucial method for evaluating students’ language skills and logical thinking abilities. Rhetoric, an essential component of essays, is also a key reference for assessing writing quality. In the era of large language models (LLMs), applying LLMs to the tasks of automatic classification and extraction of rhetorical devices is of significant importance. In this paper, we fine-tune LLMs with specific instructions to adapt them for the tasks of recognizing and extracting rhetorical devices in essays. To further enhance the performance of LLMs, we experimented with multi-task fine-tuning and expanded the training dataset through synthetic data. Additionally, we explored a model ensemble approach based on label re-inference. Our method achieved a score of 66.29 in Task 6 of the CCL 2024 Eval, Chinese Essay Rhetoric Recognition and Understanding (CERRU), securing the first position.

基于指令微调与数据增强的儿童故事常识推理与寓意理解研究
Bohan Yu (于博涵) | Yunlong Li (李云龙) | Tao Liu (刘涛) | Aoze Zheng (郑傲泽) | Kunli Zhang (张坤丽) | Hongying Zan (昝红英)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

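Although existing language models perform well on natural language processing tasks, there is still room for improvement in deep semantic understanding and commonsense reasoning. This study tests models on the children’s story commonsense reasoning and moral understanding dataset (CRMUS) to explore how to strengthen them on complex tasks. In Track 2 of the task, we ran zero-shot inference with several open-source large models of up to 7B parameters (such as Qwen and InternLM) and selected the best-performing model for LoRA-based instruction fine-tuning. In addition, we analyzed and augmented the dataset. The results show that designing effective instruction formats and tuning the LoRA fine-tuning parameters significantly improves accuracy on commonsense reasoning and moral understanding. Our system ultimately ranked first in Track 2, with a score of 74.38 on the task’s evaluation metric (Acc), a relatively advanced level.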

ZZU-NLP at SIGHAN-2024 dimABSA Task: Aspect-Based Sentiment Analysis with Coarse-to-Fine In-context Learning
Senbin Zhu | Hanjie Zhao | Xingren Wang | Shanhong Liu | Yuxiang Jia | Hongying Zan
Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10)

The DimABSA task requires fine-grained sentiment intensity prediction for restaurant reviews, including scores for Valence and Arousal dimensions for each Aspect Term. In this study, we propose a Coarse-to-Fine In-context Learning (CFICL) method based on the Baichuan2-7B model for the DimABSA task in the SIGHAN 2024 workshop. Our method improves prediction accuracy through a two-stage optimization process. In the first stage, we use fixed in-context examples and prompt templates to enhance the model’s sentiment recognition capability and provide initial predictions for the test data. In the second stage, we encode the Opinion field using BERT and select the most similar training data as new in-context examples based on similarity. These examples include the Opinion field and its scores, as well as related opinion words and their average scores. By filtering for sentiment polarity, we ensure that the examples are consistent with the test data. Our method significantly improves prediction accuracy and consistency by effectively utilizing training data and optimizing in-context examples, as validated by experimental results.
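The second-stage example selection described in this abstract can be sketched as a nearest-neighbor lookup with a polarity filter. This is a minimal illustration, not the authors' code: the short vectors stand in for BERT encodings of the Opinion field, and the assumption that the query polarity comes from the first-stage prediction, along with all names below, is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_examples(query_vec, query_polarity, train, k=2):
    """Return the k training examples most similar to the query.

    Only examples whose polarity matches the query are considered, so the
    in-context examples stay consistent with the test item.
    """
    same_polarity = [ex for ex in train if ex["polarity"] == query_polarity]
    ranked = sorted(same_polarity, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]


# Toy training pool; vectors are placeholders for real sentence embeddings.
train = [
    {"text": "delicious", "vec": [0.9, 0.1], "polarity": "pos"},
    {"text": "tasty", "vec": [0.8, 0.3], "polarity": "pos"},
    {"text": "bland", "vec": [0.85, 0.15], "polarity": "neg"},
    {"text": "friendly", "vec": [0.1, 0.9], "polarity": "pos"},
]

picked = select_examples([1.0, 0.2], "pos", train, k=2)
print([ex["text"] for ex in picked])
```

Filtering before ranking keeps a highly similar example of the opposite polarity (here "bland") from entering the prompt.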

基于知识蒸馏的低频词翻译优化策略(Knowledge Distillation-Based Optimization Strategy for Low-Frequency Word Translation in Neural Machine)
Yifan Guo (郭逸帆) | Hongying Zan (昝红英) | Ziyue Yan (阎子悦) | Hongfei Xu (许鸿飞)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

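Neural machine translation usually requires large parallel corpora to achieve good translation quality. Word frequency is unevenly distributed across parallel corpora, which can bias models during learning: they tend to learn high-frequency words and neglect the key semantic information carried by low-frequency words. Because these neglected low-frequency words also carry important translation information, translation quality may suffer. Current methods usually train a bilingual model and then assign words different weights according to frequency, increasing the weight of low-frequency words to improve their translation. In this paper, we aim to improve the translation of words that are meaningful but relatively infrequent. We propose using knowledge distillation: we train a model that translates low-frequency words better and use it as a teacher to guide the student model in learning low-frequency word translation. We then propose a more stable dual-teacher distillation model that further safeguards performance on high-frequency words, giving stable improvements across multiple tasks. Our single-teacher distillation model achieves a further 0.64 BLEU improvement over the SOTA on the English→German task, and the dual-teacher distillation model achieves a further 0.31 BLEU improvement over the SOTA on the Chinese→English task. On the English→German, English→Czech, and English→French tasks, it improves low-frequency word translation over the baseline by 1.24, 0.47, and 0.87 BLEU respectively, without changing high-frequency word translation quality.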

基于动态提示学习和依存关系的生成式结构化情感分析模型(Dynamic Prompt Learning and Dependency Relation based Generative Structured Sentiment Analysis Model)
Yintao Jia (贾银涛) | Jiajia Cui (崔佳佳) | Lingling Mu (穆玲玲) | Hongying Zan (昝红英)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

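Structured sentiment analysis aims to extract from text all sentiment tuples consisting of a sentiment holder, a target, an opinion expression, and a sentiment polarity, making it a comprehensive fine-grained sentiment analysis task. To address error propagation, insufficiently adaptive prompt templates, and the complex composition of sentiment elements in current methods, this paper proposes a generative structured sentiment analysis model based on dynamic prompt learning and dependency relations. It designs separate prompt templates for different compositions of sentiment tuples, uses the templates to augment the input of a generative pre-trained model, and uses dependency relations to improve generation. Experimental results show that the proposed model achieves a higher SF1 score on the SemEval-2022 dataset than the compared baseline models.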

2023

A Corpus for Named Entity Recognition in Chinese Novels with Multi-genres
Hanjie Zhao | Jinge Xie | Yuchen Yan | Yuxiang Jia | Yawen Ye | Hongying Zan
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

Learnable Conjunction Enhanced Model for Chinese Sentiment Analysis
Bingfei Zhao | Hongying Zan | Jiajia Wang | Yingjie Han
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

Sentiment analysis is a crucial text classification task that aims to extract, process, and analyze opinions, sentiments, and subjectivity within texts. In current research on Chinese text, sentence and aspect-based sentiment analysis is mainly tackled through well-designed models. However, despite the importance of word order and function words as essential means of semantic expression in Chinese, they are often underutilized. This paper presents a new Chinese sentiment analysis method that utilizes a Learnable Conjunctions Enhanced Model (LCEM). The LCEM adjusts the general structure of the pre-trained language model and incorporates conjunctions location information into the model’s fine-tuning process. Additionally, we discuss a variant structure of residual connections to construct a residual structure that can learn critical information in the text and optimize it during training. We perform experiments on the public datasets and demonstrate that our approach enhances performance on both sentence and aspect-based sentiment analysis datasets compared to the baseline pre-trained language models. These results confirm the effectiveness of our proposed method.

2022

Artificial Intelligence (AI), along with the recent progress in biomedical language understanding, is gradually offering great promise for medical practice. With the development of biomedical language understanding benchmarks, AI applications are widely used in the medical field. However, most benchmarks are limited to English, which makes it challenging to replicate many of the successes in English for other languages. To facilitate research in this direction, we collect real-world biomedical data and present the first Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark: a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification, and an associated online platform for model evaluation, comparison, and analysis. To establish evaluation on these tasks, we report empirical results with the current 11 pre-trained Chinese models, and experimental results show that state-of-the-art neural models perform by far worse than the human ceiling.

ParaZh-22M: A Large-Scale Chinese Parabank via Machine Translation
Wenjie Hao | Hongfei Xu | Deyi Xiong | Hongying Zan | Lingling Mu
Proceedings of the 29th International Conference on Computational Linguistics

Paraphrasing, i.e., restating the same meaning in different ways, is an important data augmentation approach for natural language processing (NLP). Zhang et al. (2019b) propose to extract sentence-level paraphrases from multiple Chinese translations of the same source texts, and construct the PKU Paraphrase Bank of 0.5M sentence pairs. However, despite being the largest Chinese parabank to date, the size of PKU parabank is limited by the availability of one-to-many sentence translation data, and cannot well support the training of large Chinese paraphrasers. In this paper, we relieve the restriction with one-to-many sentence translation data, and construct ParaZh-22M, a larger Chinese parabank that is composed of 22M sentence pairs, based on one-to-one bilingual sentence translation data and machine translation (MT). In our data augmentation experiments, we show that paraphrasing based on ParaZh-22M can bring about consistent and significant improvements over several strong baselines on a wide range of Chinese NLP tasks, including a number of Chinese natural language understanding benchmarks (CLUE) and low-resource machine translation.

期货领域知识图谱构建(Construction of Knowledge Graph in Futures Field)
Wenxin Li (李雯昕) | Hongying Zan (昝红英) | Tongfeng Guan (关同峰) | Yingjie Han (韩英杰)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

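The futures field is one of the most data-rich domains. Using research reports on commodity futures as the data source, this paper builds a Commodity Futures Knowledge Graph (CFKG). Centered on futures products, we establish a concept classification system and a relation description system, which form the concept layer of the graph. Building on the MHS-BIA and GPN models and guided by domain experts, we annotated and proofread 2.42 million characters of research report text to form the CFKG data layer, and designed a visual query system. The resulting CFKG contains 17,003 relation triples for agricultural futures and 13,703 relation triples for non-agricultural futures, providing knowledge support for applications such as text analysis, public opinion monitoring, and reasoning-based decision-making in the futures field.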

MMDAG: Multimodal Directed Acyclic Graph Network for Emotion Recognition in Conversation
Shuo Xu | Yuxiang Jia | Changyong Niu | Hongying Zan
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Emotion recognition in conversation is important for an empathetic dialogue system to understand the user’s emotion and then generate appropriate emotional responses. However, most previous research focuses on modeling conversational contexts primarily based on the textual modality or simply utilizes multimodal information through feature concatenation. In order to exploit multimodal information and contextual information more effectively, we propose a multimodal directed acyclic graph (MMDAG) network by injecting information flows inside modality and across modalities into the DAG architecture. Experiments on IEMOCAP and MELD show that our model outperforms other state-of-the-art models. Comparative studies validate the effectiveness of the proposed modality fusion method.

2021

糖尿病电子病历实体及关系标注语料库构建(Construction of Corpus for Entity and Relation Annotation of Diabetes Electronic Medical Records)
Yajuan Ye (叶娅娟) | Bin Hu (胡斌) | Kunli Zhang (张坤丽) | Hongying Zan (昝红英)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

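Electronic medical records are an important source of medical information and contain a large amount of medical domain knowledge. Starting from diabetes electronic medical record text, and after surveying existing electronic medical record corpora in China and abroad, this paper draws on the I2B2 entity and relation classification to establish a classification system for entities and entity relations in diabetes electronic medical records, and formulates annotation guidelines. Using an entity and relation annotation platform, we carried out entity and relation pre-annotation and multiple rounds of manual proofreading, producing the Diabetes Electronic Medical Record entity and Related Corpus (DEMRC). DEMRC contains 8,899 entities, 456 entity modifiers, and 16,564 relations. A consistency evaluation and analysis of DEMRC shows high annotation agreement. For entity recognition and entity relation extraction, we ran preliminary experiments with a transfer-learning-based Bi-LSTM-CRF model and a RoBERTa model respectively, and evaluated each type of entity and relation in the corpus, laying a foundation for future research on entity recognition and relation extraction in diabetes electronic medical records and for constructing a diabetes knowledge graph.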

Self-Supervised Curriculum Learning for Spelling Error Correction
Zifa Gan | Hongfei Xu | Hongying Zan
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Spelling Error Correction (SEC) that requires high-level language understanding is a challenging but useful task. Current SEC approaches normally leverage a pre-training then fine-tuning procedure that treats data equally. By contrast, Curriculum Learning (CL) utilizes training data differently during training and has shown its effectiveness in improving both performance and training efficiency in many other NLP tasks. In NMT, a model’s performance has been shown sensitive to the difficulty of training examples, and CL has been shown effective to address this. In SEC, the data from different language learners are naturally distributed at different difficulty levels (some errors made by beginners are obvious to correct while some made by fluent speakers are hard), and we expect that designing a curriculum correspondingly for model learning may also help its training and bring about better performance. In this paper, we study how to further improve the performance of the state-of-the-art SEC method with CL, and propose a Self-Supervised Curriculum Learning (SSCL) approach. Specifically, we directly use the cross-entropy loss as criteria for: 1) scoring the difficulty of training data, and 2) evaluating the competence of the model. In our approach, CL improves the model training, which in return improves the CL measurement. In our experiments on the SIGHAN 2015 Chinese spelling check task, we show that SSCL is superior to previous norm-based and uncertainty-aware approaches, and establish a new state of the art (74.38% F1).
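The idea of using cross-entropy loss both to score example difficulty and to gauge model competence can be sketched as follows. This is an illustrative assumption-laden sketch, not the SSCL algorithm itself: the abstract does not specify how competence maps to data selection, so `competence_from_loss` and the "easiest fraction" rule are hypothetical, and `gold_probs` stands in for the probabilities a real model assigns to the gold tokens.

```python
import math


def cross_entropy(gold_probs):
    """Mean negative log-likelihood of the gold tokens of one example."""
    return -sum(math.log(p) for p in gold_probs) / len(gold_probs)


def competence_from_loss(current_loss, initial_loss):
    """Hypothetical mapping: competence rises toward 1 as the loss falls."""
    return min(1.0, max(0.0, 1.0 - current_loss / initial_loss))


def curriculum_pool(examples, competence):
    """Keep the easiest `competence` fraction of examples, ranked by loss."""
    ranked = sorted(examples, key=lambda ex: cross_entropy(ex["gold_probs"]))
    keep = max(1, math.ceil(competence * len(ranked)))
    return ranked[:keep]


# Toy data: higher gold-token probabilities mean an easier example.
examples = [
    {"id": "hard", "gold_probs": [0.1, 0.2]},
    {"id": "easy", "gold_probs": [0.9, 0.8]},
    {"id": "medium", "gold_probs": [0.5, 0.4]},
]

competence = competence_from_loss(current_loss=1.0, initial_loss=2.0)
print([ex["id"] for ex in curriculum_pool(examples, competence)])
```

Because both scores come from the model's own loss, re-scoring after each training phase lets the ordering and the pool size update together as the model improves.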

融入篇章信息的文学作品命名实体识别(Document-level Literary Named Entity Recognition)
Yuxiang Jia (贾玉祥) | Rui Chao (晁睿) | Hongying Zan (昝红英) | Huayi Dou (窦华溢) | Shuai Cao (曹帅) | Shuo Xu (徐硕)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

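Named entity recognition is foundational to the intelligent analysis of literary works, yet research on named entity recognition in the literary domain remains weak, mainly because annotated corpora are lacking. Starting from Jin Yong’s novels, this paper annotates named entities in two novels totaling more than 1.8 million characters, labeling over 50,000 entities of 4 types. Given the characteristics of novel text, we propose a named entity recognition model that incorporates document-level information: it introduces a document dictionary to store the historical states of Chinese characters and uses confidence computation to fuse the BiGRU-CRF and Transformer models. Experimental results show that document-level information effectively improves named entity recognition. Finally, we discuss the application of named entity recognition to building social networks from novels.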

脑卒中疾病电子病历实体及实体关系标注语料库构建(Corpus Construction for Named-Entity and Entity Relations for Electronic Medical Records of Stroke Disease)
Hongyang Chang (常洪阳) | Hongying Zan (昝红英) | Yutuan Ma (马玉团) | Kunli Zhang (张坤丽)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

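This paper discusses the annotation of entities and inter-entity relations in Chinese electronic medical record text for stroke, and proposes an annotation scheme and guidelines for entities and entity relations suited to stroke electronic medical records. Guided by the scheme and guidelines, we carried out multiple rounds of manual annotation and correction, annotating entities and entity relations in more than 1.58 million characters of stroke electronic medical record text. The result is the Stroke Electronic Medical Record entity and entity related Corpus (SEMRC). The corpus contains 10,594 named entities and 14,457 entity relations. Annotation agreement reaches 85.16% for entity names and 94.16% for entity relations.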

2020

面向医学文本处理的医学实体标注规范(Medical Entity Annotation Standard for Medical Text Processing)
Huan Zhang (张欢) | Yuan Zong (宗源) | Baobao Chang (常宝宝) | Zhifang Sui (穗志方) | Hongying Zan (昝红英) | Kunli Zhang (张坤丽)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

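With the spread of smart healthcare, demand for using natural language processing to identify medical information is growing. At present there is no shared corpus for medical entities, which greatly hinders progress on medical text processing tasks. How should different medical entity categories be distinguished? How should the boundaries between different entities be defined? These questions have led to a lack of large-scale, consistently annotated medical text data comparable to what exists for general domains. To address them, this paper draws on the semantic types defined in UMLS to propose a medical entity annotation standard for medical text processing, covering 9 types of medical entities including diseases, clinical manifestations, medical procedures, and medical equipment, and builds a medical entity annotation corpus based on the standard. The paper describes the standard’s description system, classification principles, handling of confusable cases, the corpus annotation process, and baseline experiments on automatic medical entity annotation, in the hope of providing a reference annotation standard for building medical entity corpora and corpus support for medical entity recognition.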

Chinese Grammatical Errors Diagnosis System Based on BERT at NLPTEA-2020 CGED Shared Task
Hongying Zan | Yangchao Han | Haotian Huang | Yingjie Yan | Yuke Wang | Yingjie Han
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

In the process of learning Chinese, second language learners may make various grammatical errors due to negative transfer from their native language. This paper describes our submission to the NLPTEA 2020 shared task on CGED. We present a hybrid system that uses both detection and correction stages. The detection stage is a sequence labelling model based on BiLSTM-CRF and BERT contextual word representations. The correction stage is a hybrid model based on n-grams and Seq2Seq. Without additional features or external data, the BERT contextual word representations effectively improve the performance metrics of Chinese grammatical error detection and correction.

Chinese Grammatical Error Diagnosis Based on RoBERTa-BiLSTM-CRF Model
Yingjie Han | Yingjie Yan | Yangchao Han | Rui Chao | Hongying Zan
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

Chinese Grammatical Error Diagnosis (CGED) is a natural language processing task for the NLPTEA6 workshop. The goal of this task is to automatically diagnose grammatical errors in Chinese sentences written by L2 learners. This paper proposes a RoBERTa-BiLSTM-CRF model to detect grammatical errors in sentences. First, a RoBERTa model is used to obtain word vectors. Second, the word vectors are fed into a BiLSTM layer to learn context features. Last, a CRF layer without hand-crafted features processes the BiLSTM output. The globally optimal sequences are obtained according to the state transition matrix of the CRF and the adjacent labels of the training data. In experiments, the results of the RoBERTa-CRF and ERNIE-BiLSTM-CRF models are compared, and the impacts of model parameters and testing datasets are analyzed. In the evaluation results, the recall score of our RoBERTa-BiLSTM-CRF model ranks fourth at the detection level.

Knowledge-Enabled Diagnosis Assistant Based on Obstetric EMRs and Knowledge Graph
Kunli Zhang | Xu Zhao | Lei Zhuang | Qi Xie | Hongying Zan
Proceedings of the 19th Chinese National Conference on Computational Linguistics

The obstetric Electronic Medical Record (EMR) contains a large amount of medical data and health information. It plays a vital role in improving the quality of the diagnosis assistant service. In this paper, we treat the diagnosis assistant as a multi-label classification task and propose a Knowledge-Enabled Diagnosis Assistant (KEDA) model for the obstetric diagnosis assistant. We use the numerical information in EMRs and the external knowledge from the Chinese Obstetric Knowledge Graph (COKG) to enhance the text representation of EMRs. Specifically, the bidirectional maximum matching method and a similarity-based approach are used to obtain the set of entities contained in EMRs and link them to the COKG. The final knowledge representation is obtained by a weight-based disease prediction algorithm, and it is fused with the text representation through a linear weighting method. Experimental results show that our approach improves the F1 score by 3.53 over the strong BERT baseline in the diagnosis assistant task.

2018

Detecting Simultaneously Chinese Grammar Errors Based on a BiLSTM-CRF Model
Yajun Liu | Hongying Zan | Mengjie Zhong | Hongchao Ma
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

In the process of learning and using Chinese, many learners of Chinese as a foreign language (CFL) make grammatical errors due to negative transfer from their native languages. This paper introduces our system for the NLPTEA-5 shared task, which can simultaneously diagnose four types of grammatical errors: redundant (R), missing (M), selection (S), and disorder (W). We propose a Bidirectional LSTM CRF neural network (BiLSTM-CRF) that combines BiLSTM and CRF without hand-crafted features for Chinese Grammatical Error Diagnosis (CGED). Evaluation covers three levels: detection level, identification level, and position level. At the detection level and identification level, our system achieved the third-highest recall scores and good F1 values.

Research on Entity Relation Extraction for Military Field
Chen Liang | Hongying Zan | Yajun Liu | Yunfang Wu
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2016

Automatic Grammatical Error Detection for Chinese based on Conditional Random Field
Yajun Liu | Yingjie Han | Liyan Zhuo | Hongying Zan
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

In the process of learning and using Chinese, foreigners may make grammatical errors due to negative transfer from their native languages. Currently, computer-based automatic detection of grammatical errors is not mature enough. Based on the CGED2016 evaluation task, we select and analyze a classification model and design a feature extraction method to automatically detect grammatical errors including Missing (M), Disorder (W), Selection (S), and Redundant (R). Experimental results on the HSK dynamic corpus show that the automatic Chinese grammatical error detection method, which uses CRF as the classification model and n-grams as the feature extraction method, is simple and efficient. It contributes positively to research on automatic Chinese grammatical error detection and plays a supporting and guiding role in the teaching of Chinese as a foreign language.

2012

Chinese Personal Name Disambiguation Based on Vector Space Model
Qing-hu Fan | Hong-ying Zan | Yu-mei Chai | Yu-xiang Jia | Gui-ling Niu
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

A Comparison of Chinese Word Segmentation on News and Microblog Corpora with a Lexicon Based Method
Yuxiang Jia | Hongying Zan | Ming Fan | Zhimin Wang
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

2010

Studies on Automatic Recognition of Common Chinese Adverb’s usages Based on Statistics Methods
Hongying Zan | Junhui Zhang | Xuefeng Zhu | Shiwen Yu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Venues