Jinglei Chen
2024
SEGMENT+: Long Text Processing with Short-Context Language Models
Wei Shi | Shuang Li | Kerun Yu | Jinglei Chen | Zujie Liang | Xinhui Wu | Yuxi Qian | Feng Wei | Bo Zheng | Jiaqing Liang | Jiangjie Chen | Yanghua Xiao
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
There is a growing interest in expanding the input capacity of language models (LMs) across various domains. However, simply increasing the context window does not guarantee robust performance across diverse long-input processing tasks, such as understanding extensive documents and extracting detailed information from lengthy and noisy data. In response, we introduce Segment+, a general framework that enables LMs to handle extended inputs within limited context windows efficiently. Segment+ utilizes structured notes and a filtering module to manage information flow, resulting in a system that is both controllable and interpretable. Our extensive experiments across various model sizes, focusing on long-document question-answering and Needle-in-a-Haystack tasks, demonstrate the effectiveness of Segment+ in improving performance.
CR-LLM: A Dataset and Optimization for Concept Reasoning of Large Language Models
Nianqi Li | Jingping Liu | Sihang Jiang | Haiyun Jiang | Yanghua Xiao | Jiaqing Liang | Zujie Liang | Feng Wei | Jinglei Chen | Zhenghong Hao | Bing Han
Findings of the Association for Computational Linguistics: ACL 2024
Concept reasoning is an important capability for models to understand the world. However, existing datasets, such as those for concept extraction and concept generation, suffer from modeledge leakage and context leakage. To address these limitations, we construct a dataset of concept reasoning for large language models (CR-LLM) with modeledge leakage prevention and context leakage prevention, which consists of 2,167 samples and covers different concept types. In addition, we propose a hybrid reasoning method consisting of inductive reasoning, deductive reasoning, and a controller. This method allows large language models to adaptively select the optimal reasoning method for each input sample. Finally, we conduct extensive experiments on CR-LLM using different models and methods. The results show that existing large language models and reasoning methods perform sub-optimally on the concept reasoning task. In contrast, our proposed method significantly improves their capabilities, achieving a 7% increase in accuracy compared to CoT and demonstrating better granularity. We release CR-LLM and code at https://github.com/Nianqi-Li/Concept-Reasoning-for-LLMs.
Co-authors
- Zujie Liang 2
- Feng Wei 2
- Jiaqing Liang 2
- Yanghua Xiao 2
- Wei Shi 1