Vinh Van Nguyen

Also published as: Vinh-Van Nguyen, Vinh Van Nguyen

2025

RoSRL: Adaptive Rule-of-Sum Reinforcement Learning for Efficient and Reliable Summarization
Thu Phuong Tran Thi | Vinh Van Nguyen | Thai Nguyen Phuong | Quang Vu Ngoc | Khoa Nguyen Dang
Proceedings of the 39th Pacific Asia Conference on Language, Information and Computation

KG-CQR: Leveraging Structured Relation Representations in Knowledge Graphs for Contextual Query Retrieval
Chi Minh Bui | Ngoc Mai Thieu | Vinh Van Nguyen | Jason J. Jung | Khac-Hoai Nam Bui
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

The integration of knowledge graphs (KGs) with large language models (LLMs) offers significant potential to enhance the retrieval stage in retrieval-augmented generation (RAG) systems. In this study, we propose KG-CQR, a novel framework for Contextual Query Retrieval (CQR) that enhances the retrieval phase by enriching complex input queries with contextual representations derived from a corpus-centric KG. Unlike existing methods that primarily address corpus-level context loss, KG-CQR focuses on query enrichment through structured relation representations, extracting and completing relevant KG subgraphs to generate semantically rich query contexts. Comprising subgraph extraction, completion, and contextual generation modules, KG-CQR operates as a model-agnostic pipeline, ensuring scalability across LLMs of varying sizes without additional training. Experimental results on the RAGBench and MultiHop-RAG datasets demonstrate that KG-CQR outperforms strong baselines, achieving improvements of up to 4–6% in mAP and approximately 2–3% in Recall@25. Furthermore, evaluations on challenging RAG tasks such as multi-hop question answering show that incorporating KG-CQR improves retrieval effectiveness over the existing baseline.

2022

Multilingual parallel corpora are an important resource for many applications of natural language processing (NLP). For machine translation, the size and quality of the training corpus largely determine the quality of the translation models. In this work, we present a method for building a high-quality multilingual parallel corpus in the news domain for several low-resource languages, including Vietnamese, Lao, and Khmer, to improve the quality of multilingual machine translation for these languages. We also release the corpus publicly; it includes 500,000 Vietnamese-Chinese bilingual sentence pairs, 150,000 Vietnamese-Lao bilingual sentence pairs, and 150,000 Vietnamese-Khmer bilingual sentence pairs.
