Yuexiang Xie


2025

pdf bib
Language Adaptation of Large Language Models: An Empirical Study on LLaMA2
Shumin Wang | Yuexiang Xie | Bolin Ding | Jinyang Gao | Yanyong Zhang
Proceedings of the 31st International Conference on Computational Linguistics

There has been a surge of interest regarding language adaptation of Large Language Models (LLMs) to enhance the processing of texts in low-resource languages. While traditional language models have seen extensive research on language transfer, modern LLMs still necessitate further explorations in language adaptation. In this paper, we present a systematic review of the language adaptation process for LLMs, including vocabulary expansion, continued pre-training, and instruction fine-tuning, which focuses on empirical studies conducted on LLaMA2 and discussions on various settings affecting the model’s capabilities. This study provides helpful insights covering the entire language adaptation process, and highlights the compatibility and interactions between different steps, offering researchers a practical guidebook to facilitate the effective adaptation of LLMs across different languages.

pdf bib
Enhancing Factual Consistency in Text Summarization via Counterfactual Debiasing
Zhenqing Ling | Yuexiang Xie | Chenhe Dong | Ying Shen
Proceedings of the 31st International Conference on Computational Linguistics

Despite significant progress in abstractive text summarization aimed at generating fluent and informative outputs, how to ensure the factual consistency of generated summaries remains a crucial and challenging issue. In this study, drawing inspiration from advancements in causal inference, we construct causal graphs to analyze the process of abstractive text summarization methods and identify intrinsic causes of factual inconsistency, specifically language bias and irrelevancy bias, and we propose CoFactSum, a novel framework that mitigates the causal effects of these biases through counterfactual estimation for enhancing the factual consistency of the generated content. CoFactSum provides two counterfactual estimation strategies, including Explicit Counterfactual Masking, which employs a dynamic masking approach, and Implicit Counterfactual Training, which utilizes a discriminative cross-attention mechanism. Besides, we propose a Debiasing Degree Adjustment mechanism to dynamically calibrate the level of debiasing at each decoding step. Extensive experiments conducted on two widely used summarization datasets demonstrate the effectiveness and advantages of the proposed CoFactSum in enhancing the factual consistency of generated summaries, outperforming several baseline methods.

2024

pdf bib
When to Trust LLMs: Aligning Confidence with Response Quality
Shuchang Tao | Liuyi Yao | Hanxing Ding | Yuexiang Xie | Qi Cao | Fei Sun | Jinyang Gao | Huawei Shen | Bolin Ding
Findings of the Association for Computational Linguistics: ACL 2024

Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods often express reliability by confidence level, however, their effectiveness is limited by the lack of objective guidance. To address this, we propose CONfidence-Quality-ORDer-preserving alignment approach (CONQORD), which leverages reinforcement learning guided by a tailored dual-component reward function. This function integrates quality reward and order-preserving alignment reward functions. Specifically, the order-preserving reward incentivizes the model to verbalize greater confidence for responses of higher quality to align the order of confidence and quality. Experiments demonstrate that CONQORD significantly improves the alignment performance between confidence and response accuracy, without causing over-cautious. Furthermore, the aligned confidence provided by CONQORD informs when to trust LLMs, and acts as a determinant for initiating the retrieval process of external knowledge. Aligning confidence with response quality ensures more transparent and reliable responses, providing better trustworthiness.

2023

pdf bib
Tunable Soft Prompts are Messengers in Federated Learning
Chenhe Dong | Yuexiang Xie | Bolin Ding | Ying Shen | Yaliang Li
Findings of the Association for Computational Linguistics: EMNLP 2023

Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources, alleviating privacy concerns that arise from directly sharing local data. However, the lack of model privacy protection in FL becomes an unneglectable challenge, especially when people want to federally finetune models based on a proprietary large language model. In this study, we propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts. These soft prompts, updated and transmitted between the server and clients, assume the role of the global model parameters and serve as messengers to deliver useful knowledge from the local data and global model. As the global model itself is not required to be shared and the local training is conducted based on an auxiliary model with fewer parameters than the global model, the proposed approach provides protection for the global model while reducing communication and computation costs in FL. Extensive experiments show the effectiveness of the proposed approach compared to several baselines. We have released the source code at https://github.com/alibaba/FederatedScope/tree/fedsp/federatedscope/nlp/fedsp.

2021

pdf bib
Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
Yuexiang Xie | Fei Sun | Yang Deng | Yaliang Li | Bolin Ding
Findings of the Association for Computational Linguistics: EMNLP 2021

Despite significant progress has been achieved in text summarization, factual inconsistency in generated summaries still severely limits its practical applications. Among the key factors to ensure factual consistency, a reliable automatic evaluation metric is the first and the most crucial one. However, existing metrics either neglect the intrinsic cause of the factual inconsistency or rely on auxiliary tasks, leading to an unsatisfied correlation with human judgments or increasing the inconvenience of usage in practice. In light of these challenges, we propose a novel metric to evaluate the factual consistency in text summarization via counterfactual estimation, which formulates the causal relationship among the source document, the generated summary, and the language prior. We remove the effect of language prior, which can cause factual inconsistency, from the total causal effect on the generated summary, and provides a simple yet effective way to evaluate consistency without relying on other auxiliary tasks. We conduct a series of experiments on three public abstractive text summarization datasets, and demonstrate the advantages of the proposed metric in both improving the correlation with human judgments and the convenience of usage. The source code is available at https://github.com/xieyxclack/factual_coco.