Jie Wang


2024

Revisiting Interpolation Augmentation for Speech-to-Text Generation
Chen Xu | Jie Wang | Xiaoqian Liu | Qian Dong | Chunliang Zhang | Tong Xiao | JingBo Zhu | Dapeng Man | Wu Yang
Findings of the Association for Computational Linguistics: ACL 2024

Speech-to-text (S2T) generation systems frequently face challenges in low-resource scenarios, primarily due to the lack of extensive labeled datasets. One emerging solution is constructing virtual training samples by interpolating inputs and labels, which has notably enhanced system generalization in other domains. Despite its potential, this technique’s application in S2T tasks has remained under-explored. In this paper, we delve into the utility of interpolation augmentation, guided by several pivotal questions. Our findings reveal that employing an appropriate strategy in interpolation augmentation significantly enhances performance across diverse tasks, architectures, and data scales, offering a promising avenue for more robust S2T systems in resource-constrained settings.
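As a rough illustration of what "interpolating inputs and labels" means, here is a minimal sketch of the generic mixup-style recipe from other domains. It is not the strategy studied in the paper: the function names, the fixed-length feature vectors, the one-hot labels, and the Beta(alpha, alpha) sampling of the mixing weight are all assumptions made for this example.

```python
import random


def interpolate_pair(x_a, y_a, x_b, y_b, lam):
    """Blend two (features, label-distribution) pairs with weight lam in [0, 1].

    Hypothetical helper: assumes both samples already share the same shape,
    which real speech inputs of different lengths would not.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(x_a) != len(x_b) or len(y_a) != len(y_b):
        raise ValueError("both samples must have matching shapes")
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def sample_lambda(alpha=0.2, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha); alpha is an assumed value."""
    return rng.betavariate(alpha, alpha)


# Usage: build one virtual training sample from two toy samples.
x_virtual, y_virtual = interpolate_pair(
    [1.0, 0.0], [1.0, 0.0],   # sample A: features, one-hot label
    [0.0, 1.0], [0.0, 1.0],   # sample B: features, one-hot label
    lam=sample_lambda(),
)
```

The virtual label is a soft distribution, so a model trained on it needs a loss that accepts soft targets.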

Length Generalization of Causal Transformers without Position Encoding
Jie Wang | Tao Ji | Yuanbin Wu | Hang Yan | Tao Gui | Qi Zhang | Xuanjing Huang | Xiaoling Wang
Findings of the Association for Computational Linguistics: ACL 2024

Generalizing to longer sentences is important for recent Transformer-based language models. Besides algorithms manipulating explicit position features, the success of Transformers without position encodings (NoPE) provides a new way to overcome the challenge. In this paper, we study the length generalization property of NoPE. We find that although NoPE can extend to longer sequences than the commonly used explicit position encodings, it still has a limited context length. We identify a connection between the failure of NoPE’s generalization and the distraction of attention distributions. We propose a parameter-efficient tuning for searching attention heads’ best temperature hyper-parameters, which substantially expands NoPE’s context size. Experiments on long sequence language modeling, the synthetic passkey retrieval task and real-world long context tasks show that NoPE can achieve competitive performances with state-of-the-art length generalization algorithms. The source code is publicly accessible.
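The link between attention temperature and "distracted" attention can be illustrated with a small sketch. It assumes the common convention of dividing attention scores by a temperature before the softmax; whether the paper divides or multiplies, and how it searches per-head values, is not stated in the abstract. The function names and the use of entropy as a measure of how spread out the attention is are assumptions for illustration only.

```python
import math


def attention_weights(scores, temperature=1.0):
    """Softmax over attention scores divided by a temperature (assumed convention)."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(weights):
    """Shannon entropy in nats; higher means attention is more spread out."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


# Usage: a lower temperature concentrates attention on the top-scoring key.
scores = [1.0, 2.0, 3.0]
base = attention_weights(scores, temperature=1.0)
sharp = attention_weights(scores, temperature=0.5)
```

Under this convention, `sharp` puts more mass on the last key than `base` and has lower entropy.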

SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graph
Hanzhu Chen | Xu Shen | Qitan Lv | Jie Wang | Xiaoqi Ni | Jieping Ye
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Knowledge graphs (KGs) play a pivotal role in knowledge-intensive tasks across specialized domains, where the acquisition of precise and dependable knowledge is crucial. However, existing KG construction methods heavily rely on human intervention to attain qualified KGs, which severely hinders the practical applicability in real-world scenarios. To address this challenge, we propose a general KG construction framework, named **SAC-KG**, to exploit large language models (LLMs) as **S**killed **A**utomatic **C**onstructors for domain **K**nowledge **G**raph. SAC-KG effectively involves LLMs as domain experts to generate specialized and precise multi-level KGs. Specifically, SAC-KG consists of three components: Generator, Verifier, and Pruner. For a given entity, Generator produces its relations and tails from raw domain corpora, to construct a specialized single-level KG. Verifier and Pruner then work together to ensure precision by correcting generation errors and determining whether newly produced tails require further iteration for the next-level KG. Experiments demonstrate that SAC-KG automatically constructs a domain KG at the scale of over one million nodes and achieves a precision of 89.32%, leading to a superior performance with over 20% increase in precision rate compared to existing state-of-the-art methods for the KG construction task.

YNU-HPCC at SemEval-2024 Task 9: Using Pre-trained Language Models with LoRA for Multiple-choice Answering Tasks
Jie Wang | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This study describes the system built for Task 9 (brainteaser) of the SemEval-2024 competition, which is a multiple-choice task. Our system employs the decoding-enhanced BERT (DeBERTa) architecture with disentangled attention. We fine-tune the model using low-rank adaptation (LoRA) and integrate focal loss to address label imbalance. On the provided test dataset, our system achieves an accuracy of 0.9 on subtask 1, ranking fifth among all participants, and an accuracy of 0.781 on subtask 2, ranking seventh. The code for this paper is published at: https://github.com/123yunnandaxue/Semveal-2024_task9.
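For readers unfamiliar with focal loss, the following is a minimal sketch of its standard single-example form, which down-weights examples the model already classifies confidently. It is not taken from the authors' repository: the function name, the default gamma of 2.0, and the clamping constant are assumptions for illustration.

```python
import math


def focal_loss(probs, target, gamma=2.0, eps=1e-12):
    """Focal loss for one example: -(1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the correct class
    gamma  : focusing parameter; gamma = 0 recovers plain cross-entropy
    """
    p_t = max(probs[target], eps)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Usage: a confidently correct prediction contributes far less than a poor one.
easy = focal_loss([0.1, 0.9], target=1)
hard = focal_loss([0.9, 0.1], target=1)
```

With gamma set to 0 the modulating factor is 1, so the function reduces to ordinary cross-entropy.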

2022

The NiuTrans Machine Translation Systems for WMT22
Weiqiao Shan | Zhiquan Cao | Yuchen Han | Siming Wu | Yimin Hu | Jie Wang | Yi Zhang | Hou Baoyu | Hang Cao | Chenghao Gao | Xiaowen Liu | Tong Xiao | Anxiang Ma | Jingbo Zhu
Proceedings of the Seventh Conference on Machine Translation (WMT)

This paper describes the NiuTrans neural machine translation systems of the WMT22 General MT constrained task. We participate in four directions, including Chinese→English, English→Croatian, and Livonian↔English. Our models are based on several advanced Transformer variants, e.g., Transformer-ODE, Universal Multiscale Transformer (UMST). The main workflow consists of data filtering, large-scale data augmentation (i.e., iterative back-translation, iterative knowledge distillation), and specific-domain fine-tuning. Moreover, we try several multi-domain methods, such as a multi-domain model structure and a multi-domain data clustering method, to rise to this year’s newly proposed multi-domain test set challenge. For low-resource scenarios, we build a multi-language translation model to enhance the performance, and try to use the pre-trained language model (mBERT) to initialize the translation model.

Unregulated Chinese-to-English Data Expansion Does NOT Work for Neural Event Detection
Zhongqiu Li | Yu Hong | Jie Wang | Shiming He | Jianmin Yao | Guodong Zhou
Proceedings of the 29th International Conference on Computational Linguistics

We leverage cross-language data expansion and retraining to enhance neural Event Detection (ED) on the English ACE corpus. Machine translation is used to expand the English ED training set from the Chinese one. However, experimental results show that this strategy actually degrades performance. An inspection of the translations suggests that mistakenly aligned triggers in the expanded data negatively influence the retraining process. We refer to this phenomenon as “trigger falsification”. To overcome the issue, we apply heuristic rules to regulate the expanded data, fixing the distracting samples that contain falsified triggers. Supplementary experiments show that the rule-based regulation is beneficial, yielding an improvement of about 1.6% F1-score for ED. We additionally show that, instead of transfer learning from the translated ED data, straightforward data combination by random pouring surprisingly performs better.

2021

Deep Cognitive Reasoning Network for Multi-hop Question Answering over Knowledge Graphs
Jianyu Cai | Zhanqiu Zhang | Feng Wu | Jie Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2015

KWB: An Automated Quick News System for Chinese Readers
Yiqi Bai | Wenjing Yang | Hao Zhang | Jingwen Wang | Ming Jia | Roland Tong | Jie Wang
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing