RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Xuanwang Zhang; Yun-Ze Song; Yidong Wang; Shuyun Tang; Xinfeng Li; Zhengran Zeng; Zhen Wu; Wei Ye; Wenyuan Xu; Yue Zhang; Xinyu Dai; Shikun Zhang; Qingsong Wen

doi:10.18653/v1/2024.emnlp-demo.43

Abstract

Large Language Models (LLMs) demonstrate human-level capabilities in dialogue, reasoning, and knowledge retention. However, even the most advanced LLMs face challenges such as hallucinations and difficulty updating their knowledge in real time. Current research addresses this bottleneck by equipping LLMs with external knowledge, a technique known as Retrieval-Augmented Generation (RAG). However, two key issues have constrained the development of RAG. First, there is a growing lack of comprehensive and fair comparisons between novel RAG algorithms. Second, open-source tools such as LlamaIndex and LangChain employ high-level abstractions, which reduces transparency and limits the ability to develop novel algorithms and evaluation metrics. To close this gap, we introduce RAGLAB, a modular and research-oriented open-source library. RAGLAB reproduces 6 existing algorithms and provides a comprehensive ecosystem for investigating RAG algorithms. Leveraging RAGLAB, we conduct a fair comparison of 6 RAG algorithms across 10 benchmarks. With RAGLAB, researchers can efficiently compare the performance of various algorithms and develop novel algorithms.

Anthology ID: 2024.emnlp-demo.43
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Delia Irazu Hernandez Farias, Tom Hope, Manling Li
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 408–418
URL: https://aclanthology.org/2024.emnlp-demo.43/
DOI: 10.18653/v1/2024.emnlp-demo.43
Cite (ACL): Xuanwang Zhang, Yun-Ze Song, Yidong Wang, Shuyun Tang, Xinfeng Li, Zhengran Zeng, Zhen Wu, Wei Ye, Wenyuan Xu, Yue Zhang, Xinyu Dai, Shikun Zhang, and Qingsong Wen. 2024. RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 408–418, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation (Zhang et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-demo.43.pdf
