Rui Song


2024

Data-Centric Explainable Debiasing for Improving Fairness in Pre-trained Language Models
Yingji Li | Mengnan Du | Rui Song | Xin Wang | Ying Wang
Findings of the Association for Computational Linguistics: ACL 2024

Human-like social bias of pre-trained language models (PLMs) on downstream tasks has attracted increasing attention. Potential flaws in the training data are the main factor causing unfairness in PLMs. Existing data-centric debiasing strategies mainly leverage explicit bias words (defined as sensitive attribute words specific to demographic groups) for counterfactual data augmentation to balance the training data. However, they overlook implicit bias words potentially associated with explicit bias words in complex distribution data, which indirectly harms the fairness of PLMs. To this end, we propose a Data-Centric Debiasing method (named Data-Debias), which uses an explainability method to search for implicit bias words to assist in debiasing PLMs. Specifically, we compute the feature attributions of all tokens using the Integrated Gradients method, and then treat the tokens that have a large impact on the model’s decision as implicit bias words. To make the search results more precise, we iteratively train a biased model to amplify the bias with each iteration. Finally, we use the implicit bias words searched in the last iteration to assist in debiasing PLMs. Extensive experiments debiasing multiple PLMs on three different classification tasks demonstrate that Data-Debias achieves state-of-the-art debiasing performance and strong generalization while maintaining predictive abilities.
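
As a rough illustration of the attribution step, the sketch below computes Integrated Gradients over token embeddings for a HuggingFace-style classifier. The checkpoint name, all-zero baseline, and step count are placeholder assumptions, and the paper's iterative bias-amplification loop is not shown.

```python
# Minimal Integrated Gradients sketch over token embeddings (illustrative
# only; the checkpoint, zero baseline, and step count are assumptions).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

def token_attributions(text: str, target: int, steps: int = 20):
    enc = tokenizer(text, return_tensors="pt")
    embeds = model.get_input_embeddings()(enc["input_ids"]).detach()  # (1, T, H)
    baseline = torch.zeros_like(embeds)  # all-zero reference embedding
    total = torch.zeros_like(embeds)
    for k in range(1, steps + 1):
        # Interpolate between baseline and input, then take the gradient of
        # the target logit at that point (Riemann sum along the path).
        point = (baseline + (k / steps) * (embeds - baseline)).requires_grad_(True)
        logits = model(inputs_embeds=point, attention_mask=enc["attention_mask"]).logits
        total += torch.autograd.grad(logits[0, target], point)[0]
    scores = ((embeds - baseline) * total / steps).sum(-1).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    # Tokens with the largest attribution magnitude are candidate
    # implicit bias words.
    return sorted(zip(tokens, scores.tolist()), key=lambda p: -abs(p[1]))
```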

Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
Jian Li | Haojing Huang | Yujia Zhang | Pengfei Xu | Xi Chen | Rui Song | Lida Shi | Jingwen Wang | Hao Xu
Findings of the Association for Computational Linguistics: EMNLP 2024

Recently, there has been significant interest in replacing the reward model in Reinforcement Learning from Human Feedback (RLHF) methods for Large Language Models (LLMs), such as Direct Preference Optimization (DPO) and its variants. These approaches commonly use a binary cross-entropy mechanism on pairwise samples, i.e., minimizing the loss for preferred responses and maximizing it for dis-preferred ones, respectively. However, while this training strategy omits the reward model, it also overlooks the varying preference degrees within different responses. We hypothesize that this is a key factor hindering LLMs from sufficiently understanding human preferences. To address this problem, we propose a novel Self-supervised Preference Optimization (SPO) framework, which combines a self-supervised preference degree loss with the alignment loss, thereby helping LLMs improve their ability to understand the degree of preference. Extensive experiments are conducted on two widely used datasets covering different tasks. The results demonstrate that SPO can be seamlessly integrated with existing preference optimization methods and significantly boosts their performance, achieving state-of-the-art results. We also conduct detailed analyses that offer comprehensive insights into SPO and verify its effectiveness. The code is available at https://github.com/lijian16/SPO.
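
For intuition, here is a hedged sketch of what combining a pairwise alignment loss with a preference-degree term could look like. The dpo_loss form is the standard DPO objective; the degree head, degree labels, and weighting lam are hypothetical stand-ins rather than SPO's actual loss (see the linked repository for the real one).

```python
# Illustrative combination of a DPO-style alignment loss with an auxiliary
# preference-degree term; the degree head and weighting are hypothetical.
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO objective on pairwise log-probs (policy vs. frozen reference)."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -F.logsigmoid(margin).mean()

def degree_loss(degree_logits, degree_labels):
    """Self-supervised degree term: predict how strongly one response is
    preferred, with labels derived from the data-construction procedure."""
    return F.cross_entropy(degree_logits, degree_labels)

def spo_style_loss(logp_c, logp_r, ref_c, ref_r,
                   degree_logits, degree_labels, lam=0.5):
    # Alignment loss plus the (assumed) weighted degree-awareness term.
    return dpo_loss(logp_c, logp_r, ref_c, ref_r) + lam * degree_loss(degree_logits, degree_labels)
```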

2022

Locally Aggregated Feature Attribution on Natural Language Model Understanding
Sheng Zhang | Jin Wang | Haitao Jiang | Rui Song
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

With the growing popularity of deep-learning models, model understanding becomes increasingly important. Much effort has been devoted to demystifying deep neural networks for better explainability. Some feature attribution methods have shown promising results in computer vision, especially gradient-based methods, where effectively smoothing the gradients with reference data is the key to a robust and faithful result. However, directly applying these gradient-based methods to NLP tasks is not trivial because the input consists of discrete tokens and the “reference” tokens are not explicitly defined. In this work, we propose Locally Aggregated Feature Attribution (LAFA), a novel gradient-based feature attribution method for NLP models. Instead of relying on obscure reference tokens, it smooths gradients by aggregating similar reference texts derived from language model embeddings. For evaluation purposes, we also design experiments on different NLP tasks, including entity recognition and sentiment analysis on public datasets and keyword detection on a constructed Amazon catalogue dataset. The superior performance of the proposed method is demonstrated through these experiments.
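
A minimal sketch of the aggregation idea, assuming a HuggingFace-style model: attributions are computed against several nearby reference embeddings and averaged, rather than against a single hand-picked baseline. The gradient-times-input score and one-step interpolation below are simplifications for brevity, not the paper's exact procedure.

```python
# Sketch of locally aggregated attribution: average gradient-based scores
# computed toward several reference texts chosen by embedding similarity.
# The scoring rule and neighbor handling here are illustrative assumptions.
import torch

def gradient_attribution(model, embeds, attention_mask, target):
    embeds = embeds.detach().requires_grad_(True)
    logits = model(inputs_embeds=embeds, attention_mask=attention_mask).logits
    grad = torch.autograd.grad(logits[0, target], embeds)[0]
    return (grad * embeds).sum(-1)  # gradient x input, summed over hidden dims

def locally_aggregated_attribution(model, embeds, mask, target, reference_embeds):
    # Average attributions along one-step paths toward each nearby reference.
    scores = []
    for ref in reference_embeds:  # each ref has the same shape as embeds
        midpoint = 0.5 * (embeds + ref)
        scores.append(gradient_attribution(model, midpoint, mask, target))
    return torch.stack(scores).mean(0)
```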

A Simple Contrastive Learning Framework for Interactive Argument Pair Identification via Argument-Context Extraction
Lida Shi | Fausto Giunchiglia | Rui Song | Daqian Shi | Tongtong Liu | Xiaolei Diao | Hao Xu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Interactive argument pair identification is an emerging research task in argument mining, aiming to identify whether two arguments are interactively related. Prior work has pointed out that the context of an argument is essential for improving identification performance. However, current context-based methods achieve limited improvements because the entire context typically contains much irrelevant information. In this paper, we propose a simple contrastive learning framework that solves this problem by extracting valuable information from the context. The framework constructs hard argument-context samples and obtains a robust, uniform representation by introducing contrastive learning. We also propose an argument-context extraction module that enhances information extraction by discarding irrelevant blocks. Experimental results show that our method achieves state-of-the-art performance on the benchmark dataset. Further analysis demonstrates the effectiveness of the proposed modules and visualizes the more compact semantic representations they produce.
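
As a sketch of the contrastive component, the InfoNCE-style loss below contrasts an argument-context anchor against one positive and K hard negatives. This loss form is a common default and the batch shapes are assumptions; how the hard argument-context pairs are constructed follows the paper, not this snippet.

```python
# Common InfoNCE-style contrastive loss over argument-context embeddings
# (illustrative default, not necessarily the paper's exact formulation).
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.07):
    """anchor/positive: (B, H); negatives: (B, K, H). Returns a scalar loss."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos = (anchor * positive).sum(-1, keepdim=True)        # (B, 1) similarity
    neg = torch.einsum("bh,bkh->bk", anchor, negatives)    # (B, K) similarities
    logits = torch.cat([pos, neg], dim=1) / temperature
    labels = torch.zeros(anchor.size(0), dtype=torch.long) # positive at index 0
    return F.cross_entropy(logits, labels)
```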

单项形容词定语分布考察及“的”字隐现研究(Study on Distribution of Single Item Adjective Attributives and Appearance and Disappearance of “de”)
Rui Song (宋锐) | Zhimin Wang (王治敏)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

This paper takes 77,845 instances of single-item adjective attributives from People’s Daily (《人民日报》) articles published between 2019 and 2021 as its research object, examining from a practical perspective the distribution characteristics of bound-form (粘合式) and combined-form (组合式) attributive instances, their syllable combination patterns, and the tendency of “de” (的) to appear or be omitted. We find that bound-form attributives have markedly fewer distinct instances than combined-form attributives, yet their usage frequency is four to five times higher. In both attributive structures, adjectives and nouns are individually reused at high rates, but the proportion of their co-occurring combinations is relatively low. Meanwhile, the appearance or omission of “de” in authentic texts is polarized: the vast majority of instances show a strong tendency either to take “de” or to omit it. The presence of “de” serves to distinguish word senses and highlight information, while its omission condenses the semantics and further solidifies the sentence pattern, allowing certain patterns to develop specific-reference or metaphorical readings. This paper provides evidence and a reference for lexical-semantic research on adjective attributive structures.

2020

Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning
Ye Liu | Sheng Zhang | Rui Song | Suo Feng | Yanghua Xiao
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Open attribute value extraction for emerging entities is an important but challenging task. Many previous works formulate the problem as a question-answering (QA) task. While collections of articles from web corpora provide updated information about emerging entities, the retrieved texts can be noisy and irrelevant, leading to inaccurate answers. Effectively filtering out noisy articles as well as bad answers is the key to improving extraction accuracy. A knowledge graph (KG), which contains rich, well-organized information about entities, provides a good resource to address this challenge. In this work, we propose a knowledge-guided reinforcement learning (RL) framework for open attribute value extraction. Informed by relevant knowledge in the KG, we train a deep Q-network to sequentially compare extracted answers and improve extraction accuracy. The proposed framework is applicable to different information extraction systems. Our experimental results show that our method outperforms the baselines by 16.5–27.8%.
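
Schematically, the sequential comparison can be cast as a standard DQN update like the sketch below, where the agent decides at each step whether to keep the current best answer or replace it with a new candidate. The state features, two-action design, and network sizes are placeholder assumptions, not the paper's exact formulation.

```python
# Standard DQN update for sequential answer comparison (schematic only;
# the state encoding and action set are assumptions).
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, state_dim: int, n_actions: int = 2):
        super().__init__()
        # Action 0: keep the current best answer; action 1: replace it with
        # the newly extracted candidate (state includes KG-derived features).
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_actions))

    def forward(self, state):
        return self.net(state)

def td_update(qnet, target_net, batch, optimizer, gamma=0.99):
    state, action, reward, next_state, done = batch
    q = qnet(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target from a periodically synced target network.
        q_next = target_net(next_state).max(1).values
        target = reward + gamma * (1 - done) * q_next
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```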

2018

The USTC-NEL Speech Translation system at IWSLT 2018
Dan Liu | Junhua Liu | Wu Guo | Shifu Xiong | Zhiqiang Ma | Rui Song | Chongliang Wu | Quan Liu
Proceedings of the 15th International Conference on Spoken Language Translation

This paper describes the USTC-NEL (short for “National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China”) system for the speech translation task of the IWSLT 2018 Evaluation. The system is a conventional pipeline containing three modules: speech recognition, post-processing, and machine translation. We train a group of hybrid-HMM models for speech recognition, and for machine translation we train Transformer-based neural machine translation models with speech-recognition-style output text as input. Experiments conducted on the IWSLT 2018 task indicate that, compared to the baseline system from KIT, our system achieved a 14.9 BLEU improvement.
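
The pipeline reduces to a simple three-stage composition; the sketch below shows the interfaces only, with placeholder components (e.g., the exact post-processing step) standing in for the authors' actual models.

```python
# Schematic three-stage pipeline: ASR -> post-processing -> NMT.
# The component interfaces here are assumptions, not the authors' code.
from typing import Callable, Protocol

class ASRModel(Protocol):
    def transcribe(self, audio: bytes) -> str: ...

class MTModel(Protocol):
    def translate(self, text: str) -> str: ...

def speech_translate(audio: bytes, asr: ASRModel,
                     postprocess: Callable[[str], str], mt: MTModel) -> str:
    hypothesis = asr.transcribe(audio)  # hybrid-HMM speech recognition
    text = postprocess(hypothesis)      # normalize ASR output before MT
    return mt.translate(text)           # Transformer NMT trained on ASR-style text
```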