Yijia Zhang

Also published as: 益嘉


2025

"中文语音实体关系三元组抽取任务(Chinese Speech Entity-Relation Triple Extraction Task, CSRTE)是第二十四届中国计算语言学大会中的一项技术评测,旨在从中文语音数据中自动识别并提取实体及其相互关系,构建结构化的语音关系三元组(头实体、关系、尾实体)。本任务的目标是提升中文语音关系三元组抽取的准确性与效率,增强模型在不同语境和复杂语音场景下的鲁棒性,实现从语音输入到文本三元组输出的全流程自动化处理。通过本次评测,有助于推动中文语音信息抽取技术的发展,促进语音与自然语言处理技术的深度融合,为智能应用提供更加丰富且精准的基础数据支持。此次评测共有257支队伍报名参赛,其中59支队伍提交了A榜成绩。成绩排名前15的队伍晋级A榜,并且表现突出的前朷支队伍提交了技术报告。"
Stance detection aims to identify the attitude expressed in text towards a specific target. Recent studies on zero-shot and few-shot stance detection focus primarily on learning generalized representations from explicit targets. However, these methods often neglect implicit yet semantically important targets and fail to adaptively adjust the relative contributions of text and target in light of contextual dependencies. To overcome these limitations, we propose a novel two-stage framework: First, a data augmentation framework named Hierarchical Collaborative Target Augmentation (HCTA) employs Large Language Models (LLMs) to identify and annotate implicit targets via Chain-of-Thought (CoT) prompting and multi-LLM voting, significantly enriching training data with latent semantic relations. Second, we introduce DyMCA, a Dynamic Multi-level Context-aware Attention Network, integrating a joint text-target encoding and a content-aware mechanism to dynamically adjust text-target contributions based on context. Experiments on the benchmark dataset demonstrate that our approach achieves state-of-the-art results, confirming the effectiveness of implicit target augmentation and fine-grained contextual modeling.
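
The multi-LLM voting step can be pictured with a minimal sketch. The abstract does not state the voting rule, so the strict-majority threshold, the normalization, and the function name `vote_on_target` below are all assumptions made for illustration, not the HCTA procedure itself.

```python
from collections import Counter
from typing import Optional, Sequence


def vote_on_target(proposals: Sequence[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the implicit target most annotators agree on, or None."""
    # Normalize case and whitespace so trivially different spellings match.
    cleaned = [p.strip().lower() for p in proposals if p and p.strip()]
    if not cleaned:
        return None
    target, votes = Counter(cleaned).most_common(1)[0]
    # Keep the target only if a strict majority of all annotators proposed it
    # (assumed rule; the source does not specify one).
    return target if votes / len(proposals) > min_agreement else None


# Hypothetical outputs from three LLM annotators for a single post.
proposals = ["gun control", "Gun Control ", "second amendment"]
print(vote_on_target(proposals))
```

Returning `None` when no proposal reaches a majority lets a pipeline skip low-agreement annotations instead of adding noisy targets to the training data.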

2024

Character-based dialogue (CharacterDial) has become essential in the industry (e.g., Character.AI), enabling users to freely customize social characters for social interactions. However, the generalizability and adaptability across various conversational scenarios inherent in customizing social characters still lack public industrial solutions. To address these challenges, by dissecting well-rounded social characters composed of both inherent social profiles and external social behaviors, we manually collect a large-scale Chinese corpus featuring characters with diverse categories and behaviors, and develop CharacterGLM models alongside well-designed refinement methods. Extensive experiments show that CharacterGLM outperforms most popular open- and closed-source LLMs and performs comparably to GPT-4. We will release our data and models for local development and deployment.
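Studying how international public opinion evolves during public health emergencies offers valuable guidance for emergency management of international public opinion resources and for steering public discourse. This paper uses the Google News database, taking each country's reporting on COVID-19 as its subject, to construct an international public opinion dataset. Using a topic model and a graph neural network model, combined with temporal and spatial dimensions and the public opinion life cycle, we explore how the topics and sentiment of global public opinion evolve. The model achieves an accuracy of 0.7973 and an F1 score of 0.7826, outperforming the other baseline models. We find that public opinion across countries spreads in a radial pattern, and that the sentiment orientation and discussion topics of international media opinion are positively correlated and shift over time.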
Large language models (LLMs) show great performance in various tasks, but face deployment challenges from limited memory capacity and bandwidth. Low-bit weight quantization can save memory and accelerate inference. Although floating-point (FP) formats show good performance in LLM quantization, they tend to perform poorly with small group sizes or sub-4 bits. We find the reason is that the absence of asymmetry in previous FP quantization makes it unsuitable for handling asymmetric value distribution of LLM weight tensors. In this work, we propose asymmetric FP quantization (AFPQ), which sets separate scales for positive and negative values. Our method leads to large accuracy improvements and can be easily plugged into other quantization methods, including GPTQ and AWQ, for better performance. Besides, no additional storage is needed compared with asymmetric integer (INT) quantization. The code is available at https://github.com/zhangsichengsjtu/AFPQ.
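
The idea of separate scales for positive and negative values can be illustrated with a small sketch. It assumes a 4-bit E2M1 magnitude grid and plain nearest-value rounding within one weight group; the function names are hypothetical and this is not the code from the linked repository.

```python
import math

# Assumed FP4 (E2M1) representable magnitudes.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def nearest(x, grid):
    """Return the grid value closest to x."""
    return min(grid, key=lambda g: abs(g - x))


def quantize_symmetric(weights):
    """One shared scale, set by the largest magnitude in the group."""
    scale = max(abs(w) for w in weights) / FP4_GRID[-1]
    if scale == 0:
        return list(weights)
    return [math.copysign(nearest(abs(w) / scale, FP4_GRID), w) * scale for w in weights]


def quantize_asymmetric(weights):
    """Separate scales for the positive and the negative side of the group."""
    pos_scale = max(max(weights), 0.0) / FP4_GRID[-1]
    neg_scale = max(-min(weights), 0.0) / FP4_GRID[-1]
    out = []
    for w in weights:
        scale = pos_scale if w >= 0 else neg_scale
        if scale == 0:
            out.append(0.0)  # only reachable when w itself is zero
        else:
            out.append(math.copysign(nearest(abs(w) / scale, FP4_GRID) * scale, w))
    return out


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# A skewed group: small negative weights, larger positive ones.
group = [-0.2, -0.1, 0.0, 0.5, 1.0, 3.0]
print(mse(group, quantize_symmetric(group)), mse(group, quantize_asymmetric(group)))
```

On a skewed group like this one, the shared scale is dominated by the large positive weight and the small negative weights are rounded coarsely, whereas the asymmetric variant gives the negative side its own finer scale.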
The upscaling of Large Language Models (LLMs) has yielded impressive advances in natural language processing, yet it also poses significant deployment challenges. Weight quantization has emerged as a widely embraced solution to reduce memory and computational demands. This paper introduces BitDistiller, a framework that synergizes Quantization-Aware Training (QAT) with Knowledge Distillation (KD) to boost the performance of LLMs at ultra-low precisions (sub-4-bit). Specifically, BitDistiller first incorporates a tailored asymmetric quantization and clipping technique to maximally preserve the fidelity of quantized weights, and then proposes a novel Confidence-Aware Kullback-Leibler Divergence (CAKLD) objective, which is employed in a self-distillation manner to enable faster convergence and superior model performance. Empirical evaluations demonstrate that BitDistiller significantly surpasses existing methods in both 3-bit and 2-bit configurations on general language understanding and complex reasoning benchmarks. Notably, BitDistiller is shown to be more cost-effective, requiring less data and fewer training resources. The code is available at https://github.com/DD-DuDa/BitDistiller.
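
As a rough illustration of what asymmetric quantization with clipping means in general, the sketch below maps a weight group onto an integer grid with an offset (zero point) after shrinking the range by a clipping ratio. The abstract does not describe BitDistiller's tailored variant, so the range rule, the `clip` parameter, and the function name are assumptions for illustration only.

```python
def quantize_asym_clip(weights, bits=2, clip=1.0):
    """Quantize then dequantize a weight group on an offset integer grid.

    clip in (0, 1] shrinks the [min, max] range around its center before
    quantizing; values outside the shrunken range saturate (assumed rule).
    """
    center = (max(weights) + min(weights)) / 2
    half = (max(weights) - min(weights)) / 2 * clip
    lo, hi = center - half, center + half
    levels = 2 ** bits - 1          # number of steps between lo and hi
    scale = (hi - lo) / levels
    if scale == 0:                  # constant group: nothing to quantize
        return list(weights)
    out = []
    for w in weights:
        clamped = min(max(w, lo), hi)
        q = round((clamped - lo) / scale)   # integer code in [0, levels]
        out.append(q * scale + lo)          # dequantized value
    return out


# With 2 bits there are four representable values per group.
print(quantize_asym_clip([0.0, 1.0, 2.0, 3.0], bits=2))
print(quantize_asym_clip([0.0, 1.0, 2.0, 3.0], bits=2, clip=0.5))
```

A smaller `clip` trades saturation of outliers for finer resolution on the bulk of the weights, which is the usual motivation for clipping at very low bit widths.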

2023

Zero-shot stance detection intends to detect previously unseen targets’ stances in the testing phase. However, achieving this goal can be difficult, as it requires minimizing the domain transfer between different targets, and improving the model’s inference and generalization abilities. To address this challenge, we propose an adversarial network with external knowledge (ANEK) model. Specifically, we adopt adversarial learning based on pre-trained models to learn transferable knowledge from the source targets, thereby enabling the model to generalize well to unseen targets. Additionally, we incorporate sentiment information and common sense knowledge into the contextual representation to further enhance the model’s understanding. Experimental results on several datasets reveal that our method achieves excellent performance, demonstrating its validity and feasibility.
The purpose of Aspect Sentiment Triplet Extraction (ASTE) is to extract a triplet, including the target or aspect, its associated sentiment, and related opinion terms that explain the underlying cause of the sentiment. Some recent studies fail to capture the strong interdependence between ATE and OTE, while others fail to effectively introduce the relationship between aspects and opinions into sentiment classification tasks. To solve these problems, we construct a multi-round machine reading comprehension framework based on a rethink mechanism to solve ASTE tasks efficiently. The rethink mechanism allows the framework to model complex relationships between entities, and exclusive classifiers and probability generation algorithms can reduce query conflicts and unilateral drops in probability. Besides, the multi-round structure can fuse explicit semantic information flow between aspect, opinion and sentiment. Extensive experiments show that the proposed model achieves state-of-the-art performance and can be effectively applied to ASTE tasks.
Multimodal Named Entity Recognition (MNER) is a challenging task in social media due to the combination of text and image features. Previous MNER work has focused on predicting entity information after fusing visual and text features. However, pre-trained language models have already acquired vast amounts of knowledge during their pre-training process. To leverage this knowledge, we propose a prompt network for MNER tasks (P-MNER). To minimize the noise generated by irrelevant areas in the image, we design a visual feature extraction model (FRR) based on FasterRCNN and ResNet, which uses fine-grained visual features to assist MNER tasks. Moreover, we introduce a text correction fusion module (TCFM) into the model to address visual bias during modal fusion. We employ the idea of a residual network to modify the fused features using the original text features. Our experiments on two benchmark datasets demonstrate that our proposed model outperforms existing MNER methods. P-MNER’s ability to leverage pre-training knowledge from language models, incorporate fine-grained visual features, and correct for visual bias makes it a promising approach for multimodal named entity recognition in social media posts.