Vu Tran


2024

pdf bib
Team ISM at CLPsych 2024: Extracting Evidence of Suicide Risk from Reddit Posts with Knowledge Self-Generation and Output Refinement using A Large Language Model
Vu Tran | Tomoko Matsui
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)

This paper presents our approach to the CLPsych 2024 shared task: utilizing large language models (LLMs) for finding supporting evidence about an individual’s suicide risk level in Reddit posts. Our framework is constructed around an LLM with knowledge self-generation and output refinement. The knowledge self-generation process produces task-related knowledge which is generated by the LLM and leads to accurate risk predictions. The output refinement process, later, with the selected best set of LLM-generated knowledge, refines the outputs by prompting the LLM repeatedly with different knowledge instances interchangeably. We achieved highly competitive results comparing to the top-performance participants with our official recall of 93.5%, recall–precision harmonic-mean of 92.3%, and mean consistency of 96.1%.

2023

pdf bib
CovRelex-SE: Adding Semantic Information for Relation Search via Sequence Embedding
Truong Do | Chau Nguyen | Vu Tran | Ken Satoh | Yuji Matsumoto | Minh Nguyen
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

In recent years, COVID-19 has impacted all aspects of human life. As a result, numerous publications relating to this disease have been issued. Due to the massive volume of publications, some retrieval systems have been developed to provide researchers with useful information. In these systems, lexical searching methods are widely used, which raises many issues related to acronyms, synonyms, and rare keywords. In this paper, we present a hybrid relation retrieval system, CovRelex-SE, based on embeddings to provide high-quality search results. Our system can be accessed through the following URL: https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex-se/

2021

pdf bib
CovRelex: A COVID-19 Retrieval System with Relation Extraction
Vu Tran | Van-Hien Tran | Phuong Nguyen | Chau Nguyen | Ken Satoh | Yuji Matsumoto | Minh Nguyen
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

This paper presents CovRelex, a scientific paper retrieval system targeting entities and relations via relation extraction on COVID-19 scientific papers. This work aims at building a system supporting users efficiently in acquiring knowledge across a huge number of COVID-19 scientific papers published rapidly. Our system can be accessed via https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex/.

2020

pdf bib
How State-Of-The-Art Models Can Deal With Long-Form Question Answering
Minh-Quan Bui | Vu Tran | Ha-Thanh Nguyen | Le-Minh Nguyen
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

pdf bib
Answering Legal Questions by Learning Neural Attentive Text Representation
Phi Manh Kien | Ha-Thanh Nguyen | Ngo Xuan Bach | Vu Tran | Minh Le Nguyen | Tu Minh Phuong
Proceedings of the 28th International Conference on Computational Linguistics

Text representation plays a vital role in retrieval-based question answering, especially in the legal domain where documents are usually long and complicated. The better the question and the legal documents are represented, the more accurate they are matched. In this paper, we focus on the task of answering legal questions at the article level. Given a legal question, the goal is to retrieve all the correct and valid legal articles, that can be used as the basic to answer the question. We present a retrieval-based model for the task by learning neural attentive text representation. Our text representation method first leverages convolutional neural networks to extract important information in a question and legal articles. Attention mechanisms are then used to represent the question and articles and select appropriate information to align them in a matching process. Experimental results on an annotated corpus consisting of 5,922 Vietnamese legal questions show that our model outperforms state-of-the-art retrieval-based methods for question answering by large margins in terms of both recall and NDCG.