Donghyeon Jeon

Also published as: DongHyeon Jeon


2024

pdf bib
SLM as Guardian: Pioneering AI Safety with Small Language Model
Ohjoon Kwon | Donghyeon Jeon | Nayoung Choi | Gyu-Hwung Cho | Hwiyeol Jo | Changbong Kim | Hyunwoo Lee | Inho Kang | Sun Kim | Taiwoo Park
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

Most prior safety research of large language models (LLMs) has focused on enhancing the alignment of LLMs to better suit the safety requirements of their use cases. However, internalizing such safeguard features into larger models brought challenges of higher training cost and unintended degradation of helpfulness. In this paper, we leverage a smaller LLM for both harmful query detection and safeguard response generation. We introduce our safety requirements and the taxonomy of harmfulness categories, and then propose a multi-task learning mechanism fusing the two tasks into a single model. We demonstrate the effectiveness of our approach, providing on par or surpassing harmful query detection and safeguard response performance compared to the publicly available LLMs.

pdf bib
RADCoT: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion
Sung-Min Lee | Eunhwan Park | DongHyeon Jeon | Inho Kang | Seung-Hoon Na
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Large language models (LLMs) have demonstrated superior performance to that of small language models (SLM) in information retrieval for various subtasks including dense retrieval, reranking, query expansion, and pseudo-document generation. However, the parameter sizes of LLMs are extremely large, making it expensive to operate LLMs stably for providing LLM-based retrieval services. Recently, retrieval-augmented language models have been widely employed to significantly reduce the parameter size by retrieving relevant knowledge from large-scale corpora and exploiting the resulting “in-context” knowledge as additional model input, thereby substantially reducing the burden of internalizing and retaining world knowledge in model parameters. Armed by the retrieval-augmented language models, we present a retrieval-augmented model specialization that distills the capability of LLMs to generate the chain-of-thoughts (CoT) for query expansion – that is, injects the LLM’s capability to generate CoT into a retrieval-augmented SLM – referred to as RADCoT. Experimental results on the MS-MARCO, TREC DL 19, 20 datasets show that RADCoT yields consistent improvements over distillation without retrieval, achieving comparable performance to that of the query expansion method using LLM-based CoTs. Our code is publicly available at https://github.com/ZIZUN/RADCoT.

2023

pdf bib
MAFiD: Moving Average Equipped Fusion-in-Decoder for Question Answering over Tabular and Textual Data
Sung-Min Lee | Eunhwan Park | Daeryong Seo | Donghyeon Jeon | Inho Kang | Seung-Hoon Na
Findings of the Association for Computational Linguistics: EACL 2023

Transformer-based models for question answering (QA) over tables and texts confront a “long” hybrid sequence over tabular and textual elements, causing long-range reasoning problems. To handle long-range reasoning, we extensively employ a fusion-in-decoder (FiD) and exponential moving average (EMA), proposing a Moving Average Equipped Fusion-in-Decoder (MAFiD). With FiD as the backbone architecture, MAFiD combines various levels of reasoning: independent encoding of homogeneous data and single-row and multi-row heterogeneous reasoning, using a gated cross attention layer to effectively aggregate the three types of representations resulting from various reasonings. Experimental results on HybridQA indicate that MAFiD achieves state-of-the-art performance by increasing exact matching (EM) and F1 by 1.1 and 1.7, respectively, on the blind test set.

2022

pdf bib
LM-BFF-MS: Improving Few-Shot Fine-tuning of Language Models based on Multiple Soft Demonstration Memory
Eunhwan Park | Donghyeon Jeon | Seonhoon Kim | Inho Kang | Seung-Hoon Na
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

LM-BFF (CITATION) achieves significant few-shot performance by using auto-generated prompts and adding demonstrations similar to an input example. To improve the approach of LM-BFF, this paper proposes LM-BFF-MSbetter few-shot fine-tuning of language models with multiple soft demonstrations by making its further extensions, which include 1) prompts with multiple demonstrations based on automatic generation of multiple label words; and 2) soft demonstration memory which consists of multiple sequences of globally shared word embeddings for a similar context. Experiments conducted on eight NLP tasks show that LM-BFF-MS leads to improvements over LM-BFF on five tasks, particularly achieving 94.0 and 90.4 on SST-2 and MRPC, respectively.

pdf bib
SISER: Semantic-Infused Selective Graph Reasoning for Fact Verification
Eunhwan Park | Jong-Hyeon Lee | DongHyeon Jeon | Seonhoon Kim | Inho Kang | Seung-Hoon Na
Proceedings of the 29th International Conference on Computational Linguistics

This study proposes Semantic-Infused SElective Graph Reasoning (SISER) for fact verification, which newly presents semantic-level graph reasoning and injects its reasoning-enhanced representation into other types of graph-based and sequence-based reasoning methods. SISER combines three reasoning types: 1) semantic-level graph reasoning, which uses a semantic graph from evidence sentences, whose nodes are elements of a triple – <Subject, Verb, Object>, 2) “semantic-infused” sentence-level “selective” graph reasoning, which combine semantic-level and sentence-level representations and perform graph reasoning in a selective manner using the node selection mechanism, and 3) sequence reasoning, which concatenates all evidence sentences and performs attention-based reasoning. Experiment results on a large-scale dataset for Fact Extraction and VERification (FEVER) show that SISER outperforms the previous graph-based approaches and achieves state-of-the-art performance.