Qingrong Xia


2024

Adaptive Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization
Yixin Ji | Yang Xiang | Juntao Li | Qingrong Xia | Zi Ye | Xinyu Duan | Zhefeng Wang | Kehai Chen | Min Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024

In recent years, large language models (LLMs) have driven advances in natural language processing, but their growing scale has increased the computational burden, necessitating a balance between efficiency and performance. Low-rank compression, a promising technique, reduces non-essential parameters by decomposing weight matrices into products of two low-rank matrices; yet its application to LLMs has not been extensively studied. The key to low-rank compression lies in low-rank factorization and low-rank dimension allocation. To address the challenges of low-rank compression in LLMs, we conduct empirical research on the low-rank characteristics of large models and propose a low-rank compression method suited to LLMs. This approach involves precise estimation of feature distributions through pooled covariance matrices and a Bayesian optimization strategy for allocating low-rank dimensions. Experiments on the LLaMA-2 models demonstrate that our method outperforms existing strong structured pruning and low-rank compression techniques in maintaining model performance at the same compression ratio.
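
For illustration only, a minimal sketch of the generic low-rank factorization idea described above, i.e., approximating a weight matrix as the product of two low-rank factors via truncated SVD. This is not the paper's exact procedure (which estimates feature distributions via pooled covariance matrices and allocates ranks with Bayesian optimization); the function name and shapes below are assumptions of our own.

import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int):
    """Approximate W (out_dim x in_dim) as A @ B, with A: out_dim x rank and B: rank x in_dim."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb the singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Example: a 4096 x 4096 layer factorized at rank 512 stores 2 * 4096 * 512
# parameters instead of 4096 * 4096, i.e., about 75% fewer.
W = np.random.randn(256, 256).astype(np.float32)
A, B = low_rank_factorize(W, rank=32)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"relative reconstruction error: {rel_err:.3f}")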

2023

AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing
Asaad Alghamdi | Xinyu Duan | Wei Jiang | Zhenhai Wang | Yimeng Wu | Qingrong Xia | Zhefeng Wang | Yi Zheng | Mehdi Rezagholizadeh | Baoxing Huai | Peilun Cheng | Abbas Ghaddar
Findings of the Association for Computational Linguistics: ACL 2023

Developing monolingual large pre-trained language models (PLMs) has been shown to be very successful in handling different tasks in natural language processing (NLP). In this work, we present AraMUS, the largest Arabic PLM, with 11B parameters trained on 529GB of high-quality Arabic textual data. AraMUS achieves state-of-the-art performance on a diverse set of Arabic classification and generative tasks. Moreover, AraMUS shows impressive few-shot learning abilities compared with the best existing Arabic PLMs.

2022

MuCPAD: A Multi-Domain Chinese Predicate-Argument Dataset
Yahui Liu | Haoping Yang | Chen Gong | Qingrong Xia | Zhenghua Li | Min Zhang
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

During the past decade, neural network models have made tremendous progress on in-domain semantic role labeling (SRL). However, performance drops dramatically under the out-of-domain setting. In order to facilitate research on cross-domain SRL, this paper presents MuCPAD, a multi-domain Chinese predicate-argument dataset, which consists of 30,897 sentences and 92,051 predicates from six different domains. MuCPAD exhibits three important features. 1) Based on a frame-free annotation methodology, we avoid writing complex frames for new predicates. 2) We explicitly annotate omitted core arguments to recover more complete semantic structure, considering that omission of content words is ubiquitous in multi-domain Chinese texts. 3) We compile 53 pages of annotation guidelines and adopt strict double annotation for improving data quality. This paper describes in detail the annotation methodology and annotation process of MuCPAD, and presents in-depth data analysis. We also give benchmark results on cross-domain SRL based on MuCPAD.

Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing
Shilin Zhou | Qingrong Xia | Zhenghua Li | Yu Zhang | Yu Hong | Min Zhang
Proceedings of the 29th International Conference on Computational Linguistics

This paper proposes to cast end-to-end span-based SRL as a word-based graph parsing task. The major challenge is how to represent spans at the word level. Borrowing ideas from research on Chinese word segmentation and named entity recognition, we propose and compare four different graph representation schemata, i.e., BES, BE, BIES, and BII, among which we find that the BES schema performs best. We further gain interesting insights through detailed analysis. Moreover, we propose a simple constrained Viterbi procedure to ensure that the output graph is legal with respect to the constraints of the SRL structure. We conduct experiments on two widely used benchmark datasets, i.e., CoNLL05 and CoNLL12. Results show that our word-based graph parsing approach achieves consistently better performance than previous methods, under both the end-to-end and predicate-given settings, without and with pre-trained language models (PLMs). More importantly, our model can parse 669/252 sentences per second without and with PLMs, respectively.
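
As a rough illustration of how an argument span can be mapped onto word-level labels, the sketch below reflects one plausible reading of the BES schema (our own illustration, not code from the paper, and the exact label semantics are an assumption): the first and last words of a multi-word span are labeled B and E, and a one-word span is labeled S.

def span_to_bes(span_start: int, span_end: int, role: str) -> dict:
    """Map an argument span [span_start, span_end] (inclusive word indices) to
    word-level BES-style labels: S- for single-word spans, otherwise B- on the
    first word and E- on the last word (interior words are left unlabeled here)."""
    if span_start == span_end:
        return {span_start: f"S-{role}"}
    return {span_start: f"B-{role}", span_end: f"E-{role}"}

# e.g. an ARG1 span covering words 3..5 becomes {3: "B-ARG1", 5: "E-ARG1"},
# and a single-word ARG0 at position 0 becomes {0: "S-ARG0"}.
print(span_to_bes(3, 5, "ARG1"))
print(span_to_bes(0, 0, "ARG0"))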

Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures inside Arguments
Yu Zhang | Qingrong Xia | Shilin Zhou | Yong Jiang | Guohong Fu | Min Zhang
Proceedings of the 29th International Conference on Computational Linguistics

Semantic role labeling (SRL) is a fundamental yet challenging task in the NLP community. Recent work on SRL mainly falls into two lines: 1) BIO-based; 2) span-based. Despite their ubiquity, both share the intrinsic drawback of not considering internal argument structures, potentially hindering the model's expressiveness. The key challenge is that arguments are flat structures, with no predetermined subtree realizations for the words inside them. To remedy this, we propose to regard flat argument spans as latent subtrees, thereby reducing SRL to a tree parsing task. In particular, we equip our formulation with a novel span-constrained TreeCRF to make tree structures span-aware, and further extend it to the second-order case. We conduct extensive experiments on the CoNLL05 and CoNLL12 benchmarks. Results reveal that our methods outperform all previous syntax-agnostic works, achieving new state-of-the-art results under both the end-to-end and w/ gold predicates settings.
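
For readers who want the TreeCRF idea spelled out, one plausible way to write the span-constrained objective is sketched below; the notation is ours and may not match the paper exactly.

% Probability of the argument structure y for sentence x, marginalizing over
% latent dependency trees t that realize every argument span of y as a subtree.
\[
  p(y \mid x) \;=\; \frac{\sum_{t \in \mathcal{T}(y)} \exp\big(s(x, t)\big)}
                         {\sum_{t' \in \mathcal{T}} \exp\big(s(x, t')\big)}
\]
% Here \mathcal{T} is the set of all dependency trees over x, \mathcal{T}(y) keeps
% only the trees consistent with the spans of y, and s(x, t) is the (first- or
% second-order) score of tree t. Both sums can be computed with the inside
% algorithm, so training can maximize \log p(y \mid x) exactly.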

2021

A Unified Span-Based Approach for Opinion Mining with Syntactic Constituents
Qingrong Xia | Bo Zhang | Rui Wang | Zhenghua Li | Yue Zhang | Fei Huang | Luo Si | Min Zhang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Fine-grained opinion mining (OM), which aims to find the opinion structures of “Who expressed what opinions towards what” in a sentence, has attracted increasing attention in the natural language processing (NLP) community. In this work, motivated by its span-based representations of opinion expressions and roles, we propose a unified span-based approach for the end-to-end OM setting. Furthermore, inspired by the unified span-based formalism shared by OM and constituent parsing, we explore two different methods (multi-task learning and graph convolutional networks) to integrate syntactic constituents into the proposed model to help OM. We conduct experiments on the commonly used MPQA 2.0 dataset. The experimental results show that our unified span-based approach achieves significant improvements over previous works in exact F1 score and reduces the number of wrongly predicted opinion expressions and roles, showing the effectiveness of our method. In addition, incorporating syntactic constituents achieves promising improvements over the strong baseline enhanced with contextualized word representations.

Stacked AMR Parsing with Silver Data
Qingrong Xia | Zhenghua Li | Rui Wang | Min Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

The lack of sufficient human-annotated data is one main challenge for abstract meaning representation (AMR) parsing. To alleviate this problem, previous works usually make use of silver data or pre-trained language models. In particular, one recent seq-to-seq work directly fine-tunes an encoder-decoder pre-trained language model on linearized AMR graph sequences and achieves new state-of-the-art results, outperforming previous works by a large margin. However, its decoding is relatively slow. In this work, we investigate alternative approaches that achieve competitive performance at faster speeds. We propose a simplified AMR parser and a pre-training technique for the effective usage of silver data. We conduct extensive experiments on the widely used AMR 2.0 dataset, and the results demonstrate that our Transformer-based AMR parser achieves the best performance among seq2graph-based models. Furthermore, with silver data, our model achieves results competitive with the SOTA model while being an order of magnitude faster. Detailed analyses are conducted to gain more insights into our proposed model and the effectiveness of the pre-training technique.

2020

Semantic Role Labeling with Heterogeneous Syntactic Knowledge
Qingrong Xia | Rui Wang | Zhenghua Li | Yue Zhang | Min Zhang
Proceedings of the 28th International Conference on Computational Linguistics

Recently, due to the interplay between syntax and semantics, incorporating syntactic knowledge into neural semantic role labeling (SRL) has received much attention. Most previous syntax-aware SRL works focus on explicitly modeling homogeneous syntactic knowledge over tree outputs. In this work, we propose to encode heterogeneous syntactic knowledge for SRL from both explicit and implicit representations. First, we introduce graph convolutional networks to explicitly encode multiple heterogeneous dependency parse trees. Second, we extract implicit syntactic representations from a syntactic parser trained on heterogeneous treebanks. Finally, we inject the two types of heterogeneous syntax-aware representations into the base SRL model as extra inputs. We conduct experiments on two widely used benchmark datasets, i.e., Chinese Proposition Bank 1.0 and the English CoNLL-2005 dataset. Experimental results show that incorporating heterogeneous syntactic knowledge brings significant improvements over strong baselines. We further conduct detailed analysis to gain insights into the usefulness of heterogeneous (vs. homogeneous) syntactic knowledge and the effectiveness of our proposed approaches for modeling such knowledge.
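
As a loose illustration of the first ingredient above, i.e., explicitly encoding dependency parse trees with graph convolutional networks, a single GCN layer over a dependency adjacency matrix might look like the following PyTorch sketch (ours, not the paper's exact architecture):

import torch
import torch.nn as nn

class DepGCNLayer(nn.Module):
    """One graph-convolution layer over a dependency tree (illustrative sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   (batch, seq_len, dim) word representations
        # adj: (batch, seq_len, seq_len) symmetric 0/1 adjacency with self-loops,
        #      adj[b, i, j] = 1 if words i and j are linked by a dependency arc
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)  # simple degree normalization
        msg = torch.bmm(adj, self.linear(h)) / deg          # aggregate neighbor messages
        return torch.relu(msg + h)                          # residual connection

# Heterogeneous parses can be handled by running such a layer (or stack) once per
# treebank-specific adjacency matrix and combining the resulting representations.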

2019

SUDA-Alibaba at MRP 2019: Graph-Based Models with BERT
Yue Zhang | Wei Jiang | Qingrong Xia | Junjie Cao | Rui Wang | Zhenghua Li | Min Zhang
Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning

In this paper, we describe our participating systems in the shared task on Cross-Framework Meaning Representation Parsing (MRP) at the 2019 Conference on Computational Natural Language Learning (CoNLL). The task includes five frameworks for graph-based meaning representation, i.e., DM, PSD, EDS, UCCA, and AMR. One common characteristic of our systems is that we employ graph-based rather than transition-based methods when predicting edges between nodes. For SDP (DM and PSD), we jointly perform edge prediction, frame tagging, and POS tagging via multi-task learning (MTL). For UCCA, we also jointly model constituent tree parsing and a remote edge recovery task. For both EDS and AMR, we produce nodes first and edges second in a pipeline fashion. External resources such as BERT are found to be helpful for all frameworks except AMR. Our final submission ranks third on the overall MRP evaluation metric, first on EDS, and second on UCCA.

A Syntax-aware Multi-task Learning Framework for Chinese Semantic Role Labeling
Qingrong Xia | Zhenghua Li | Min Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Semantic role labeling (SRL) aims to identify the predicate-argument structure of a sentence. Inspired by the strong correlation between syntax and semantics, previous works pay much attention to improving SRL performance by exploiting syntactic knowledge, achieving significant results. Pipeline methods based on automatic syntactic trees and multi-task learning (MTL) approaches using standard syntactic trees are two common research orientations. In this paper, we adopt a simple unified span-based model for both span-based and word-based Chinese SRL as a strong baseline. In addition, we present an MTL framework that includes the basic SRL module and a dependency parser module. Different from the commonly used hard parameter sharing strategy in MTL, the main idea is to extract implicit syntactic representations from the dependency parser as external inputs for the basic SRL model. Experiments on the Chinese Proposition Bank 1.0 and CoNLL-2009 Chinese benchmarks show that our proposed framework can effectively improve performance over the strong baselines. With external BERT representations, our framework achieves new state-of-the-art F1 scores of 87.54 and 88.5 on the test sets of the two benchmarks, respectively. In-depth analyses are conducted to gain more insights into the proposed framework and the effectiveness of syntax.

2017

Dependency Parsing with Partial Annotations: An Empirical Comparison
Yue Zhang | Zhenghua Li | Jun Lang | Qingrong Xia | Min Zhang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper describes and compares two straightforward approaches to dependency parsing with partial annotations (PA). The first approach is based on a forest-based training objective for two CRF parsers, i.e., a biaffine neural network graph-based parser (Biaffine) and a traditional log-linear graph-based parser (LLGPar). The second approach is based on the idea of constrained decoding for three parsers, i.e., a traditional linear graph-based parser (LGPar), a globally normalized neural network transition-based parser (GN3Par), and a traditional linear transition-based parser (LTPar). In the test phase, constrained decoding is also used to complete partial trees. We conduct experiments on the Penn Treebank under three different settings for simulating PA, i.e., random, most uncertain, and divergent outputs from the five parsers. The results show that LLGPar is most effective in directly learning from PA, and that the other parsers achieve their best performance when PAs are first completed into full trees by LLGPar.
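
As a minimal illustration of the constrained-decoding idea (ours, independent of the five parsers compared above), partial annotations can be enforced by masking the arc score matrix before running any standard decoder:

import numpy as np

NEG_INF = -1e9

def apply_partial_annotation(arc_scores: np.ndarray, partial_heads: dict) -> np.ndarray:
    """arc_scores[d, h] is the score of word h being the head of word d (0 = root).
    partial_heads maps annotated dependents to their given heads; every other
    candidate head for those dependents is masked out, so any downstream decoder
    (Eisner, MST, beam search) is forced to respect the partial annotation."""
    constrained = arc_scores.copy()
    for dep, head in partial_heads.items():
        constrained[dep, :] = NEG_INF
        constrained[dep, head] = arc_scores[dep, head]
    return constrained

# Example: force word 3 to attach to word 1, leaving the other words unconstrained.
scores = np.random.randn(6, 6)
constrained = apply_partial_annotation(scores, {3: 1})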