Hakan Doğan
2025
Enhancing Regulatory Compliance Through Automated Retrieval, Reranking, and Answer Generation
Kübranur Umar | Hakan Doğan | Onur Özcan | İsmail Karakaya | Alper Karamanlıoğlu | Berkan Demirel
Proceedings of the 1st Regulatory NLP Workshop (RegNLP 2025)
This paper presents a Retrieval-Augmented Generation (RAG) pipeline that optimizes regulatory compliance by combining embedding models (bge-m3, jina-embeddings-v3, e5-large-v2) with a reranker (bge-reranker-v2-m3). To process long-context passages efficiently, we introduce a context-aware chunking method. By using the RePASS metric, we ensure comprehensive coverage of obligations and minimize contradictions, thereby setting a new benchmark for RAG-based regulatory compliance systems. The experimental results show that our best configuration achieves a Recall@10 of 0.79 and a MAP@10 of 0.66, with the LLaMA-3.1-8B model used for answer generation.
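For readers unfamiliar with the two retrieval metrics reported above, the following is a minimal sketch of how Recall@k and MAP@k are commonly computed. It is not taken from the paper: the function names, the passage identifiers, and the choice to normalize average precision by min(|relevant|, k) are assumptions made for illustration, and the authors' evaluation script may use a different normalization. Each ranked list is assumed to contain no duplicate passage IDs.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant passages that appear among the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average of precision values taken at each rank where a relevant passage occurs.

    Assumption: the sum is normalized by min(len(relevant), k); some toolkits
    divide by len(relevant) instead.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, passage_id in enumerate(ranked[:k], start=1):
        if passage_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / min(len(relevant), k)


def mean_average_precision_at_k(
    runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10
) -> float:
    """Mean of AP@k over all (ranked list, relevant set) pairs, one pair per question."""
    scores = [average_precision_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: one question with three relevant passages, two of them retrieved.
ranked_example = ["p3", "p1", "p7", "p2"]
relevant_example = {"p1", "p2", "p9"}
recall_example = recall_at_k(ranked_example, relevant_example, k=10)
ap_example = average_precision_at_k(ranked_example, relevant_example, k=10)
```

In this sketch Recall@k rewards finding relevant passages anywhere in the top k, while MAP@k additionally rewards placing them early, which is the aspect a reranker is meant to improve.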