Long Zhang

2024

Global-Pruner: A Stable and Efficient Pruner for Retraining-Free Pruning of Encoder-Based Language Models
Guangzhen Yao | Yuehan Wang | Hui Xu | Long Zhang | Miao Qi
Proceedings of the 28th Conference on Computational Natural Language Learning

Large language models (LLMs) have achieved significant success in complex tasks across various domains, but they come with high computational costs and inference latency issues. Pruning, as an effective method, can significantly reduce inference costs. However, current pruning algorithms for encoder-based language models often focus on locally optimal solutions, neglecting a comprehensive exploration of the global solution space. This oversight can lead to instability in the solution process, thereby affecting the overall performance of the model. To address these challenges, we propose a structured pruning algorithm named G-Pruner (Global Pruner), comprising two integral components: PPOM (Proximal Policy Optimization Mask) and CG²MT (Conjugate Gradient Squared Mask Tuning), utilizing a global optimization strategy. This strategy not only eliminates the need for retraining but also ensures the algorithm’s stability and adaptability to environmental changes, effectively addressing the issue of focusing solely on immediate optima while neglecting long-term effects. This method is evaluated on the GLUE and SQuAD benchmarks using BERT-base and DistilBERT models. The experimental results indicate that without any retraining, G-Pruner achieves significant accuracy improvements on the SQuAD2.0 task with a FLOPs constraint of 60%, demonstrating a 6.02% increase in F1 score compared with baseline algorithms.

2023

CCL23-Eval 任务6系统报告:基于深度学习的电信网络诈骗案件分类(System Report for CCL23-Eval Task 6: Classification of Telecom Internet Fraud Cases Based on Deep Learning)
Chenyang Li (李晨阳) | Long Zhang (张龙) | Zhongjie Zhao (赵中杰) | Hui Guo (郭辉)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

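As a fundamental task in natural language processing, text classification plays a crucial role in classifying cases in the telecom network fraud domain and is of great significance for intelligent case analysis. The goal of this task is to classify given case description texts, each of which contains an overall, anonymized description of the case. We first fine-tune the Ernie pre-trained model on the case content to obtain the category of each case, and then use pseudo-labeling and model ensembling to improve the F1 score. Our system ranked second in the CCL23-Eval Task 6 evaluation on telecom network fraud case classification, with a score of 0.8628 on the task's evaluation metric, F1, achieving relatively advanced detection performance.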

2021

Point, Disambiguate and Copy: Incorporating Bilingual Dictionaries for Neural Machine Translation
Tong Zhang | Long Zhang | Wei Ye | Bo Li | Jinan Sun | Xiaoyu Zhu | Wen Zhao | Shikun Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper proposes a neural architecture that incorporates bilingual dictionaries into Neural Machine Translation (NMT) models. By introducing three novel components (Pointer, Disambiguator, and Copier), our method PDC offers the following advantages over previous efforts: (1) Pointer leverages the semantic information from bilingual dictionaries, for the first time, to better locate source words whose dictionary translations can potentially be used; (2) Disambiguator synthesizes contextual information from the source view and the target view, both of which contribute to distinguishing the proper translation of a specific source word from multiple candidates in dictionaries; (3) Copier systematically connects Pointer and Disambiguator through a hierarchical copy mechanism seamlessly integrated with Transformer, thereby building an end-to-end architecture that avoids the error propagation problems of alternative pipeline methods. Experimental results on Chinese-English and English-Japanese benchmarks demonstrate PDC’s overall superiority and the effectiveness of each component.

Multi-Hop Transformer for Document-Level Machine Translation
Long Zhang | Tong Zhang | Haibo Zhang | Baosong Yang | Wei Ye | Shikun Zhang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Document-level neural machine translation (NMT) has proven to be of profound value for its effectiveness in capturing contextual information. Nevertheless, existing approaches 1) simply introduce the representations of context sentences without explicitly characterizing the inter-sentence reasoning process; and 2) feed ground-truth target contexts as extra inputs at training time, and thus face the problem of exposure bias. We approach these problems by drawing inspiration from human behavior: human translators ordinarily form a translation draft in their mind and progressively revise it according to discourse-level reasoning. To this end, we propose a novel Multi-Hop Transformer (MHT) that gives NMT the ability to explicitly model this human-like draft-editing and reasoning process. Specifically, our model treats the sentence-level translation as a draft and refines its representations by attending to multiple antecedent sentences iteratively. Experiments on four widely used document translation tasks demonstrate that our method can significantly improve document-level translation performance and can tackle discourse phenomena such as coreference errors and polysemy.

2019

PKUSE at SemEval-2019 Task 3: Emotion Detection with Emotion-Oriented Neural Attention Network
Luyao Ma | Long Zhang | Wei Ye | Wenhui Hu
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper presents our system for SemEval-2019 Task 3, “EmoContext: Contextual Emotion Detection in Text”. We propose a deep learning architecture with bidirectional LSTM networks, augmented with an emotion-oriented attention network that is capable of extracting emotion information from an utterance. Experimental results show that our model outperforms its variants and the baseline. Overall, the system achieves a micro-averaged F1 score of 75.57%.