Aakash Mahalingam
2025
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
Aakash Mahalingam | Vinesh Kumar Gande | Aman Chadha | Vinija Jain | Divya Chaudhary
Proceedings of the Workshop on Generative AI and Knowledge Graphs (GenAIK)
This paper presents the SKETCH approach, which improves text retrieval and context relevancy on large corpora compared with traditional baseline methods. Abstract: Retrieval-Augmented Generation (RAG) systems have become pivotal in leveraging vast corpora to generate informed and contextually relevant responses, notably reducing hallucinations in Large Language Models. Despite significant advancements, these systems struggle to efficiently process and retrieve information from large datasets while maintaining a comprehensive understanding of the context. This paper introduces SKETCH, a novel methodology that enhances the RAG retrieval process by integrating semantic text retrieval with knowledge graphs, thereby merging structured and unstructured data for a more holistic comprehension. SKETCH demonstrates substantial improvements in retrieval performance and maintains superior context integrity compared to traditional methods. Evaluated across four diverse datasets (QuALITY, QASPER, NarrativeQA, and Italian Cuisine), SKETCH consistently outperforms baseline approaches on key RAGAS metrics such as answer relevancy, faithfulness, context precision, and context recall. Notably, on the Italian Cuisine dataset, SKETCH achieved an answer relevancy of 0.94 and a context precision of 0.99, representing the highest performance across all evaluated metrics. These results highlight SKETCH’s capability in delivering more accurate and contextually relevant responses, setting new benchmarks for future retrieval systems.
Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization
Sahil Wadhwa | Chengtian Xu | Haoming Chen | Aakash Mahalingam | Akankshya Kar | Divya Chaudhary
Proceedings of the First Workshop on Multilingual Counterspeech Generation
The automatic generation of counter-speech (CS) is a critical strategy for addressing hate speech by providing constructive and informed responses. However, existing methods often fail to generate high-quality, impactful, and scalable CS, particularly across diverse linguistic contexts. In this paper, we propose a novel methodology to enhance CS generation by aligning Large Language Models (LLMs) using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Our approach leverages DPO to align LLM outputs with human preferences, ensuring contextually appropriate and linguistically adaptable responses. Additionally, we incorporate knowledge grounding to enhance the factual accuracy and relevance of generated CS. Experimental results demonstrate that DPO-aligned models significantly outperform SFT baselines on CS benchmarks while scaling effectively to multiple languages. These findings highlight the potential of preference-based alignment techniques to advance CS generation across varied linguistic settings. Model supervision and alignment are done in English, and the same model is used to report metrics for other languages such as Basque, Italian, and Spanish.
Co-authors
- Divya Chaudhary 2
- Aman Chadha 1
- Haoming Chen 1
- Vinesh Kumar Gande 1
- Vinija Jain 1