Wenyang Luo


2024

pdf bib
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models
Tianyi Tang | Wenyang Luo | Haoyang Huang | Dongdong Zhang | Xiaolei Wang | Xin Zhao | Furu Wei | Ji-Rong Wen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.It remains a challenging problem to explain the underlying mechanisms by which LLMs process multilingual texts.In this paper, we delve into the composition of Transformer architectures in LLMs to pinpoint language-specific regions.Specially, we propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.Based on LAPE, we conduct comprehensive experiments on several representative LLMs, such as LLaMA-2, BLOOM, and Mistral. Our findings indicate that LLMs’ proficiency in processing a particular language is predominantly due to a small subset of neurons, primarily situated in the models’ top and bottom layers.Furthermore, we showcase the feasibility to “steer” the output language of LLMs by selectively activating or deactivating language-specific neurons. Our research provides important evidence to the understanding and exploration of the multilingual capabilities of LLMs.

pdf bib
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang | Hu Yiwen | Bingqian Li | Wenyang Luo | ZiJing Qin | Haoxiang Sun | Jiapeng Wang | Shiyi Xu | Xiaoxue Cheng | Geyang Guo | Han Peng | Bowen Zheng | Yiru Tang | Yingqian Min | Yushuo Chen | Jie Chen | Ranchi Zhao | Luran Ding | Yuhao Wang | Zican Dong | Xia Chunxuan | Junyi Li | Kun Zhou | Xin Zhao | Ji-Rong Wen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

To facilitate the research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs. This library is featured with three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets, and models, and (3) more practical consideration, especially on user-friendliness and efficiency. With our library, users can easily reproduce existing methods, train new models, and conduct comprehensive performance comparisons. To rigorously test LLMBox, we conduct extensive experiments in a diverse coverage of evaluation settings, and experimental results demonstrate the effectiveness and efficiency of our library in supporting various implementations related to LLMs. The detailed introduction and usage guidance can be found at https://github.com/RUCAIBox/LLMBox.