Minbyul Jeong


2024

CompAct: Compressing Retrieved Documents Actively for Question Answering
Chanwoong Yoon | Taewhoo Lee | Hyeon Hwang | Minbyul Jeong | Jaewoo Kang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Retrieval-augmented generation supports language models to strengthen their factual groundings by providing external contexts. However, language models often face challenges when given extensive information, diminishing their effectiveness in solving questions. Context compression tackles this issue by filtering out irrelevant information, but current methods still struggle in realistic scenarios where crucial information cannot be captured with a single-step approach. To overcome this limitation, we introduce CompAct, a novel framework that employs an active strategy to condense extensive documents without losing key information. Our experiments demonstrate that CompAct brings significant improvements in both performance and compression rate on multi-hop question-answering benchmarks. CompAct flexibly operates as a cost-efficient plug-in module with various off-the-shelf retrievers or readers, achieving exceptionally high compression rates (47x).

2020

Answering Questions on COVID-19 in Real-Time
Jinhyuk Lee | Sean S. Yi | Minbyul Jeong | Mujeen Sung | WonJin Yoon | Yonghwa Choi | Miyoung Ko | Jaewoo Kang
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

The recent outbreak of the novel coronavirus is wreaking havoc on the world and researchers are struggling to effectively combat it. One reason why the fight is difficult is due to the lack of information and knowledge. In this work, we outline our effort to contribute to shrinking this knowledge vacuum by creating covidAsk, a question answering (QA) system that combines biomedical text mining and QA techniques to provide answers to questions in real-time. Our system also leverages information retrieval (IR) approaches to provide entity-level answers that are complementary to QA models. Evaluation of covidAsk is carried out by using a manually created dataset called COVID-19 Questions which is based on information from various sources, including the CDC and the WHO. We hope our system will be able to aid researchers in their search for knowledge and information not only for COVID-19, but for future pandemics as well.