MiniRAG: A Lightweight RAG system with Small Language Models

Tianyu Fan; Jingyuan Wang; Xubin Ren; Chao Huang

Abstract

The growing demand for efficient and lightweight Retrieval-Augmented Generation (RAG) systems has highlighted significant challenges when deploying Small Language Models (SLMs) in existing RAG frameworks. Current approaches face severe performance degradation due to SLMs’ limited semantic understanding and text processing capabilities, creating barriers for widespread adoption in resource-constrained scenarios. To address these fundamental limitations, we present MiniRAG, a novel RAG system designed for simplicity and efficiency. MiniRAG introduces two key technical innovations: (1) a semantic-aware heterogeneous graph indexing mechanism that combines text chunks and named entities in a unified structure, reducing reliance on complex semantic understanding, and (2) a lightweight topology-enhanced retrieval approach that leverages graph structures for efficient knowledge discovery without requiring advanced language capabilities. Our extensive experiments demonstrate that MiniRAG achieves performance comparable to LLM-based methods even when using SLMs, while requiring only 25% of the storage space. Additionally, we contribute a comprehensive benchmark dataset for evaluating lightweight RAG systems under realistic on-device scenarios with complex queries.
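
The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (build_index, retrieve), the scoring weights, and the assumption that entities have already been extracted for each chunk and for the query are all hypothetical choices made for illustration. The sketch builds a graph with two node types (text chunks and named entities) and ranks chunks for a query using only graph structure, with no embedding or language-model call.

```python
from collections import defaultdict


def build_index(entities_per_chunk):
    """Build a toy heterogeneous graph with chunk nodes and entity nodes.

    entities_per_chunk maps a chunk id to the entity names found in it
    (entity extraction is assumed to have happened elsewhere).
    """
    graph = defaultdict(set)
    for chunk_id, entities in entities_per_chunk.items():
        for ent in entities:
            # Entity <-> chunk edges: the entity is mentioned in the chunk.
            graph[("chunk", chunk_id)].add(("entity", ent))
            graph[("entity", ent)].add(("chunk", chunk_id))
        for a in entities:
            for b in entities:
                if a != b:
                    # Entity <-> entity edges: co-occurrence in one chunk.
                    graph[("entity", a)].add(("entity", b))
    return graph


def retrieve(graph, query_entities, top_k=2):
    """Rank chunks by graph proximity to the query entities.

    Hypothetical scoring: +2 for a chunk that mentions a query entity,
    +1 for each path to a chunk through a co-occurring entity.
    """
    scores = defaultdict(int)
    for ent in query_entities:
        for neighbor in graph.get(("entity", ent), ()):
            if neighbor[0] == "chunk":
                scores[neighbor[1]] += 2
            else:
                for second in graph.get(neighbor, ()):
                    if second[0] == "chunk":
                        scores[second[1]] += 1
    # Sort by descending score, then by chunk id for a stable order.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_k]


graph = build_index({
    "c1": ["Alice", "Acme", "Paris"],
    "c2": ["Acme", "robots"],
    "c3": ["Bob", "Berlin"],
})
print(retrieve(graph, ["Alice"]))
```

In this toy example, a query mentioning only "Alice" still surfaces chunk c2, because "Acme" co-occurs with "Alice" in c1 and is also mentioned in c2. That is the sense in which retrieval leans on topology rather than on deep semantic understanding.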

Anthology ID: 2026.acl-long.1721
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 37121–37144
URL: https://aclanthology.org/2026.acl-long.1721/
Cite (ACL): Tianyu Fan, Jingyuan Wang, Xubin Ren, and Chao Huang. 2026. MiniRAG: A Lightweight RAG system with Small Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 37121–37144, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): MiniRAG: A Lightweight RAG system with Small Language Models (Fan et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1721.pdf
Checklist: 2026.acl-long.1721.checklist.pdf
