Yingqian Min
2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang | Hu Yiwen | Bingqian Li | Wenyang Luo | ZiJing Qin | Haoxiang Sun | Jiapeng Wang | Shiyi Xu | Xiaoxue Cheng | Geyang Guo | Han Peng | Bowen Zheng | Yiru Tang | Yingqian Min | Yushuo Chen | Jie Chen | Ranchi Zhao | Luran Ding | Yuhao Wang | Zican Dong | Xia Chunxuan | Junyi Li | Kun Zhou | Xin Zhao | Ji-Rong Wen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
To facilitate research on large language models (LLMs), this paper presents LLMBox, a comprehensive and unified library that eases the development, use, and evaluation of LLMs. The library has three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets, and models, and (3) practical considerations, especially user-friendliness and efficiency. With the library, users can easily reproduce existing methods, train new models, and conduct comprehensive performance comparisons. To rigorously test LLMBox, we conduct extensive experiments across a diverse range of evaluation settings, and the experimental results demonstrate the effectiveness and efficiency of the library in supporting various LLM-related implementations. A detailed introduction and usage guidance can be found at https://github.com/RUCAIBox/LLMBox.
DATA-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning
Yingqian Min | Kun Zhou | Dawei Gao | Xin Zhao | He Hu | Yaliang Li
Findings of the Association for Computational Linguistics: ACL 2024
Recently, multi-task instruction tuning has been utilized to improve sentence representation learning (SRL). It enables SRL models to generate task-specific representations with the guidance of task instruction, thus exhibiting strong generalization ability on unseen tasks. However, these methods mostly neglect the potential interference problems across different tasks and instances, which may affect the training of the model. To address this issue, we propose a data curriculum method, namely **Data-CUBE**, that arranges the order of all the multi-task data for training, to minimize the interference risks from two aspects. At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk and formulate this problem as the traveling salesman problem, which is further solved by a specially designed simulated annealing algorithm. At the instance level, we propose a measurement method to quantify the difficulty of all instances per task, and then arrange instances in an easy-to-difficult order for training. Experimental results show that our approach can boost the performance of state-of-the-art methods. Our code and data will be publicly released.