Yi Zhao


2024

FanLoRA: Fantastic LoRAs and Where to Find Them in Large Language Model Fine-tuning
Aaron Xuxiang Tian | Yi Zhao | Congrui Yin | Wei Zhu | Xing Tian | Yi Ge
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

Full-parameter fine-tuning is computationally prohibitive for large language models (LLMs), making parameter-efficient fine-tuning (PEFT) methods like low-rank adaptation (LoRA) increasingly popular. However, LoRA and its existing variants introduce significant latency in multi-tenant settings, hindering their application in industry. To address this issue, we propose the Fantastic LoRA (FanLoRA) framework, which consists of four steps: (a) adding LoRA modules to all the Transformer linear weights and fine-tuning on a large-scale instruction tuning dataset; (b) assessing the importance of each module with a novel importance scoring method; (c) retaining only the most critical modules per layer, which yields the FanLoRA setting; and (d) applying the FanLoRA setting to fine-tune on various downstream tasks. Our extensive experiments demonstrate that (a) FanLoRA outperforms existing PEFT baselines across a wide collection of tasks with comparable tunable parameters, and (b) FanLoRA significantly reduces the inference latency of LoRA, making it valuable for further broadening the applications of LLMs in industry.
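Step (c) of the abstract above, keeping only the most critical modules per layer, can be illustrated with a minimal sketch. The abstract does not describe the importance scoring method, so the scores here are made-up inputs, and the function name, module names, and `keep` parameter are hypothetical choices for illustration only.

```python
from typing import Dict, List


def select_modules_per_layer(
    scores: Dict[int, Dict[str, float]], keep: int
) -> Dict[int, List[str]]:
    """For each layer, keep the `keep` modules with the highest score.

    `scores` maps layer index -> {module name -> importance score}.
    How the scores are computed is NOT covered here (assumed given).
    """
    selected: Dict[int, List[str]] = {}
    for layer, module_scores in scores.items():
        # Rank by descending score; break ties by name for determinism.
        ranked = sorted(module_scores, key=lambda name: (-module_scores[name], name))
        selected[layer] = sorted(ranked[:keep])
    return selected


# Hypothetical importance scores for two Transformer layers.
example_scores = {
    0: {"q_proj": 0.9, "k_proj": 0.1, "v_proj": 0.7, "o_proj": 0.3},
    1: {"q_proj": 0.2, "k_proj": 0.4, "v_proj": 0.8, "o_proj": 0.6},
}
fan_setting = select_modules_per_layer(example_scores, keep=2)
print(fan_setting)
```

The resulting per-layer module lists play the role of the retained setting that step (d) would reuse when fine-tuning on downstream tasks.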

PARA: Parameter-Efficient Fine-tuning with Prompt-Aware Representation Adjustment
Zequan Liu | Yi Zhao | Ming Tan | Wei Zhu | Aaron Xuxiang Tian
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

In the realm of parameter-efficient fine-tuning (PEFT) methods, while options like LoRA are available, there is a persistent demand in the industry for a PEFT approach that excels in both efficiency and performance within the context of single-backbone multi-tenant applications. This paper introduces a new and straightforward PEFT technique, termed Prompt Aware Representation Adjustment (PARA). The core of our proposal is to integrate a lightweight vector generator within each Transformer layer. This generator produces vectors that are responsive to input prompts, thereby adjusting the hidden representations accordingly. Our extensive experimentation across diverse tasks has yielded promising results. Firstly, the PARA method has been shown to surpass current PEFT benchmarks in terms of performance, despite having a similar number of adjustable parameters. Secondly, it has proven to be more efficient than LoRA in the single-backbone multi-tenant scenario, highlighting its significant potential for industrial adoption.
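The idea of a lightweight, prompt-responsive vector generator can be sketched as follows. The abstract does not specify the generator's architecture or how the vector adjusts the hidden representations, so this sketch assumes mean pooling over the prompt, a small bottleneck with a tanh nonlinearity, and an additive adjustment. All function names and weight shapes are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix `w` (rows x cols) by vector `x` (cols)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def mean_pool(hidden: List[Vector]) -> Vector:
    """Average the per-token hidden states of the prompt."""
    n = len(hidden)
    return [sum(col) / n for col in zip(*hidden)]


def generate_vector(prompt_hidden: List[Vector], w_down: Matrix, w_up: Matrix) -> Vector:
    """Assumed generator: pool -> down-project -> tanh -> up-project."""
    pooled = mean_pool(prompt_hidden)
    bottleneck = [math.tanh(v) for v in matvec(w_down, pooled)]
    return matvec(w_up, bottleneck)


def adjust(hidden: List[Vector], vector: Vector) -> List[Vector]:
    """Assumed adjustment: add the generated vector to every token state."""
    return [[h + v for h, v in zip(token, vector)] for token in hidden]


# Toy example: hidden size 2, bottleneck size 1.
prompt_hidden = [[1.0, 0.0], [0.0, 1.0]]
w_down = [[1.0, 1.0]]    # shape (1, 2)
w_up = [[1.0], [0.0]]    # shape (2, 1)
vector = generate_vector(prompt_hidden, w_down, w_up)
adjusted = adjust(prompt_hidden, vector)
print(vector, adjusted)
```

Because the generated vector depends only on the prompt, different tenants sharing one backbone would each need only their own small generator weights under this assumed design.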

MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning
Jingfan Zhang | Yi Zhao | Dan Chen | Xing Tian | Huanran Zheng | Wei Zhu
Findings of the Association for Computational Linguistics: EMNLP 2024

Low-rank adaptation (LoRA) and its mixture-of-experts (MOE) variants are highly effective parameter-efficient fine-tuning (PEFT) methods. However, they introduce significant latency in multi-tenant settings due to the LoRA modules and MOE routers added to multiple linear modules in the Transformer layer. To address this issue, we propose Mixture of Low-Rank Adaptation (MiLoRA), a novel and efficient LoRA variant. MiLoRA differs from previous MOE-style LoRA methods by considering each LoRA module as an expert and employing a prompt-aware routing mechanism. This mechanism calculates expert routing results once before generating the first new token and reuses these results for subsequent tokens, reducing latency. Extensive experiments and analysis on commonsense reasoning tasks, math reasoning tasks, and widely used LLM evaluation benchmarks demonstrate that MiLoRA consistently outperforms strong PEFT baselines with comparable tunable parameter budgets. Additionally, MiLoRA significantly reduces latency in multi-tenant settings compared to previous LoRA-based methods.
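The latency argument in this abstract rests on computing the expert routing once from the prompt and reusing it for every generated token. A minimal sketch of that caching pattern is given below. The router form (a linear scoring of a pooled prompt vector followed by softmax), the class name, and the call counter are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


class PromptRouter:
    """Hypothetical router: one weight row per LoRA expert."""

    def __init__(self, router_weights: List[List[float]]) -> None:
        self.router_weights = router_weights
        self.calls = 0                      # counts routing computations
        self.weights: Optional[List[float]] = None

    def prepare(self, prompt_summary: List[float]) -> None:
        """Compute expert weights once, before the first new token."""
        self.calls += 1
        logits = [
            sum(w * x for w, x in zip(row, prompt_summary))
            for row in self.router_weights
        ]
        self.weights = softmax(logits)


def decode(router: PromptRouter, prompt_summary: List[float], num_new_tokens: int) -> List[List[float]]:
    """Toy decoding loop that reuses the cached routing for each token."""
    router.prepare(prompt_summary)
    per_token = []
    for _ in range(num_new_tokens):
        # No router call here: every step reads the cached weights.
        per_token.append(router.weights)
    return per_token


router = PromptRouter([[1.0, 0.0], [0.0, 1.0]])
per_token = decode(router, [0.5, 0.5], num_new_tokens=5)
print(router.calls, per_token[0])
```

In a token-level router the routing cost would grow with the number of generated tokens; in this sketch it is paid once per prompt regardless of output length.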

RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict
Yirong Zeng | Xiao Ding | Yi Zhao | Xiangyu Li | Jie Zhang | Chao Yao | Ting Liu | Bing Qin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Fact-checking is the task of verifying the factuality of a given claim by examining the available evidence. High-quality evidence plays a vital role in enhancing fact-checking systems and facilitating the generation of explanations that are understandable to humans. However, providing evidence that is both sufficient and relevant for explainable fact-checking systems remains a challenge. To tackle this challenge, we propose a method based on a Large Language Model to automatically retrieve and summarize evidence from the Web. Furthermore, we construct RU22Fact, a novel multilingual explainable fact-checking dataset of 16K samples on the 2022 Russia-Ukraine conflict, each sample containing a real-world claim, optimized evidence, and a referenced explanation. To establish a baseline for our dataset, we also develop an end-to-end explainable fact-checking system to verify claims and generate explanations. Experimental results demonstrate the promise of optimized evidence for improving fact-checking performance and also indicate room for further progress in the end-to-end claim verification and explanation generation tasks.

2022

UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition
Guimin Hu | Ting-En Lin | Yi Zhao | Guangming Lu | Yuchuan Wu | Yongbin Li
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for computers to understand human behaviors. From a psychological perspective, emotions are the expression of affect or feelings during a short period, while sentiments are formed and held for a longer period. However, most existing works study sentiment and emotion separately and do not fully exploit the complementary knowledge behind the two. In this paper, we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies MSA and ERC tasks from features, labels, and models. We perform modality fusion at the syntactic and semantic levels and introduce contrastive learning between modalities and samples to better capture the difference and consistency between sentiments and emotions. Experiments on four public benchmark datasets, MOSI, MOSEI, MELD, and IEMOCAP, demonstrate the effectiveness of the proposed method and achieve consistent improvements compared with state-of-the-art methods.

2021

Bidirectional Hierarchical Attention Networks based on Document-level Context for Emotion Cause Extraction
Guimin Hu | Guangming Lu | Yi Zhao
Findings of the Association for Computational Linguistics: EMNLP 2021

Emotion cause extraction (ECE) aims to extract the causes behind a certain emotion in text. Several works on the ECE task have been published and have attracted considerable attention in recent years. However, these methods neglect two major issues: 1) they pay little attention to the effect of document-level context information on ECE, and 2) they do not sufficiently explore how to use the annotated emotion clause effectively. For the first issue, we propose a bidirectional hierarchical attention network (BHA) corresponding to the specified candidate cause clause to capture the document-level context in a structured and dynamic manner. For the second issue, we design an emotional filtering module (EF) for each layer of the graph attention network, which calculates a gate score based on the emotion clause to filter out irrelevant information. Combining the BHA and EF, the EF-BHA can dynamically aggregate contextual information from two directions and filter out irrelevant information. The experimental results demonstrate that EF-BHA achieves competitive performance on two public datasets in different languages (Chinese and English). Moreover, we quantify the effect of context on emotion cause extraction and provide a visualization of the interactions between candidate cause clauses and contexts.
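The gate-based filtering described in this abstract can be illustrated with a minimal sketch. The abstract only states that a gate score is computed from the emotion clause, so the specific form used here (a sigmoid of the dot product between a clause vector and the emotion clause vector, applied as a multiplicative scale) is an assumption, as are the function names.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate_score(clause: Vector, emotion: Vector) -> float:
    """Assumed gate: sigmoid of the clause/emotion dot product."""
    return sigmoid(sum(c * e for c, e in zip(clause, emotion)))


def filter_clauses(clauses: List[Vector], emotion: Vector) -> List[Vector]:
    """Scale each clause vector by its gate score (values in (0, 1))."""
    return [[gate_score(c, emotion) * v for v in c] for c in clauses]


# Toy vectors: the first clause aligns with the emotion clause, the second does not.
emotion = [1.0, 0.0]
clauses = [[2.0, 0.0], [0.0, 2.0]]
filtered = filter_clauses(clauses, emotion)
print(filtered)
```

Under this assumed gate, clause vectors that align with the emotion clause pass through nearly unchanged, while unrelated ones are damped toward half their magnitude or less.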