Youngjoong Ko

2025

ECO Decoding: Entropy-Based Control for Controllability and Fluency in Controllable Dialogue Generation
Seungmin Shin | Dooyoung Kim | Youngjoong Ko
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Controllable Dialogue Generation (CDG) enables chatbots to generate responses with desired attributes, and weighted decoding methods have achieved significant success in the CDG task. However, using a fixed constant value to manage the bias of attribute probabilities makes it challenging to find an ideal control strength that satisfies both controllability and fluency. To address this issue, we propose ECO decoding Entropy-based COntrol, which dynamically adjusts the control strength at each generation step according to the model’s entropy in both the language model and attribute classifier probability distributions. Experimental results on DailyDialog and MultiWOZ datasets show that our method achieves improved control accuracy while maintaining fluency and grammar, outperforming previous decoding methods across various models and settings. Furthermore, ECO decoding alleviates probability interpolation issues in multi-attribute generation, demonstrating its robust performance in both single and multi-attribute scenarios.

pdf bib abs

DAPI: Domain Adaptive Toxicity Probe Vector Intervention, for Fine-Grained Detoxification
Cho Hyeonsu | Dooyoung Kim | Youngjoong Ko
Findings of the Association for Computational Linguistics: ACL 2025

There have been attempts to utilize linear probe for detoxification, with existing studies relying on a single toxicity probe vector to reduce toxicity. However, toxicity can be fine-grained into various subcategories, making it difficult to remove certain types of toxicity by using a single toxicity probe vector. To address this limitation, we propose a category-specific toxicity probe vector approach. First, we train multiple toxicity probe vectors for different toxicity categories. During generation, we dynamically select the most relevant toxicity probe vector based on the current context. Finally, the selected vector is dynamically scaled and subtracted from model. Our method successfully mitigated toxicity from categories that the single probe vector approach failed to detoxify. Experiments demonstrate that our approach achieves up to a 78.52% reduction in toxicity on the evaluation dataset, while fluency remains nearly unchanged, with only a 0.052% drop compared to the unsteered model.

pdf bib abs

Decoding Dense Embeddings: Sparse Autoencoders for Interpreting and Discretizing Dense Retrieval
Seongwan Park | Taeklim Kim | Youngjoong Ko
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Despite their strong performance, Dense Passage Retrieval (DPR) models suffer from a lackof interpretability. In this work, we propose a novel interpretability framework that leveragesSparse Autoencoders (SAEs) to decompose previously uninterpretable dense embeddings fromDPR models into distinct, interpretable latent concepts. We generate natural language descriptionsfor each latent concept, enabling human interpretations of both the dense embeddingsand the query-document similarity scores of DPR models. We further introduce Concept-Level Sparse Retrieval (CL-SR), a retrieval framework that directly utilizes the extractedlatent concepts as indexing units. CL-SR effectively combines the semantic expressiveness ofdense embeddings with the transparency and efficiency of sparse representations. We showthat CL-SR achieves high index-space and computational efficiency while maintaining robustperformance across vocabulary and semantic mismatches.

2024

pdf bib abs

Hyper-QKSG: Framework for Automating Query Generation and Knowledge-Snippet Extraction from Tables and Lists
Dooyoung Kim | Yoonjin Jang | Dongwook Shin | Chanhoon Park | Youngjoong Ko
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

These days, there is an increasing necessity to provide a user with a short knowledge-snippet for a query in commercial information retrieval services such as the featured snippet of Google. In this paper, we focus on how to automatically extract the candidates of query-knowledge snippet pairs from structured HTML documents by using a new Language Model (HTML-PLM). In particular, the proposed system is powerful on extracting them from Tables and Lists, and provides a new framework for automate query generation and knowledge-snippet extraction based on a QA-pair filtering procedure including the snippet refinement and verification processes, which enhance the quality of generated query-knowledge snippet pairs. As a result, 53.8% of the generated knowledge-snippets includes complex HTML structures such as tables and lists in our experiments of a real-world environments, and 66.5% of the knowledge-snippets are evaluated as valid.

pdf bib abs

RAC: Retrieval-augmented Conversation Dataset for Open-domain Question Answering in Conversational Settings
Bonggeun Choi | Jeongjae Park | Yoonsung Kim | Jae-Hyun Park | Youngjoong Ko
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

In recent years, significant advancements in conversational question and answering (CQA) have been driven by the exponential growth of large language models and the integration of retrieval mechanisms that leverage external knowledge to generate accurate and contextually relevant responses. Consequently, the fields of conversational search and retrieval-augmented generation (RAG) have obtained substantial attention for their capacity to address two key challenges: query rewriting within conversational histories for better retrieval performance and generating responses by employing retrieved knowledge. However, both fields are often independently studied, and comprehensive study on entire systems remains underexplored. In this work, we present a novel retrieval-augmented conversation (RAC) dataset and develop a baseline system comprising query rewriting, retrieval, reranking, and response generation stages. Experimental results demonstrate the competitiveness of the system and extensive analyses are conducted to apprehend the impact of retrieval results to response generation.

2023

pdf bib abs

The Situated Interactive MultiModal Conversations (SIMMC2.1) Challenge 2022 is hosted by the Eleventh Dialog System Technology Challenge (DSTC11). This is the third consecutive year multimodal dialog systems have been selected as an official track of the competition, promoted by the continued interest in the research community. The task of SIMMC is to create a shopping assistant agent that can communicate with customers in a virtual store. It requires processing store scenes and product catalogs along with the customer’s request. The task is decomposed into four steps and each becomes a subtask. In this work, we explore the common approaches to modeling multimodality and find the method with the most potential. We also identify a discrepancy in using pretrained language models for dialog tasks and devise a simple domain-adaptation method. Our model came in third place for object coreferencing, dialog state tracking, and response generation tasks.

pdf bib abs

Topic-Informed Dialogue Summarization using Topic Distribution and Prompt-based Modeling
Jaeah You | Youngjoong Ko
Findings of the Association for Computational Linguistics: EMNLP 2023

Dealing with multiple topics should be considered an important issue in dialogue summarization, because dialogues, unlike documents, are prone to topic drift. Thus, we propose a new dialogue summarization model that reflects dialogue topic distribution to consider all topics present in the dialogue. First, the distribution of dialogue topics is estimated by an effective topic discovery model. Then topic-informed prompt transfers estimated topic distribution information to the output of encoder and decoder vectors. Finally, the topic extractor estimates the summary topic distribution from the output context vector of decoder to distinguish its difference from the dialogue topic distribution. To consider the proportion of each topic distribution appeared in the dialogue, the extractor is trained to reduce the difference between the distributions of the dialogue and the summary. The experimental results on SAMSum and DialogSum show that our model outperforms state-of-the-art methods on ROUGE scores. The human evaluation results also show that our framework well generates comprehensive summaries.

2021

pdf bib abs

Fine-grained Post-training for Improving Retrieval-based Dialogue Systems
Janghoon Han | Taesuk Hong | Byoungjae Kim | Youngjoong Ko | Jungyun Seo
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Retrieval-based dialogue systems display an outstanding performance when pre-trained language models are used, which includes bidirectional encoder representations from transformers (BERT). During the multi-turn response selection, BERT focuses on training the relationship between the context with multiple utterances and the response. However, this method of training is insufficient when considering the relations between each utterance in the context. This leads to a problem of not completely understanding the context flow that is required to select a response. To address this issue, we propose a new fine-grained post-training method that reflects the characteristics of the multi-turn dialogue. Specifically, the model learns the utterance level interactions by training every short context-response pair in a dialogue session. Furthermore, by using a new training objective, the utterance relevance classification, the model understands the semantic relevance and coherence between the dialogue utterances. Experimental results show that our model achieves new state-of-the-art with significant margins on three benchmark datasets. This suggests that the fine-grained post-training method is highly effective for the response selection task.

pdf bib abs

Graph-based Fake News Detection using a Summarization Technique
Gihwan Kim | Youngjoong Ko
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Nowadays, fake news is spreading in various ways, and this fake information is causing a lot of social damages. Thus the need to detect fake information is increasing to prevent the damages caused by fake news. In this paper, we propose a novel graph-based fake news detection method using a summarization technique that uses only the document internal information. Our proposed method represents the relationship between all sentences using a graph and the reflection rate of contextual information among sentences is computed by using an attention mechanism. In addition, we improve the performance of fake news detection by utilizing summary information as an important subject of the document. The experimental results demonstrate that our method achieves high accuracy, 91.04%, that is 8.85%p better than the previous method.

2020

pdf bib abs

Multi-Task Learning for Knowledge Graph Completion with Pre-trained Language Models
Bosung Kim | Taesuk Hong | Youngjoong Ko | Jungyun Seo
Proceedings of the 28th International Conference on Computational Linguistics

As research on utilizing human knowledge in natural language processing has attracted considerable attention in recent years, knowledge graph (KG) completion has come into the spotlight. Recently, a new knowledge graph completion method using a pre-trained language model, such as KG-BERT, is presented and showed high performance. However, its scores in ranking metrics such as Hits@k are still behind state-of-the-art models. We claim that there are two main reasons: 1) failure in sufficiently learning relational information in knowledge graphs, and 2) difficulty in picking out the correct answer from lexically similar candidates. In this paper, we propose an effective multi-task learning method to overcome the limitations of previous works. By combining relation prediction and relevance ranking tasks with our target link prediction, the proposed model can learn more relational properties in KGs and properly perform even when lexical similarity occurs. Experimental results show that we not only largely improve the ranking performances compared to KG-BERT but also achieve the state-of-the-art performances in Mean Rank and Hits@10 on the WN18RR dataset.

2018

pdf bib abs

Word Sense Disambiguation Based on Word Similarity Calculation Using Word Vector Representation from a Knowledge-based Graph
Dongsuk O | Sunjae Kwon | Kyungsun Kim | Youngjoong Ko
Proceedings of the 27th International Conference on Computational Linguistics

Word sense disambiguation (WSD) is the task to determine the word sense according to its context. Many existing WSD studies have been using an external knowledge-based unsupervised approach because it has fewer word set constraints than supervised approaches requiring training data. In this paper, we propose a new WSD method to generate the context of an ambiguous word by using similarities between an ambiguous word and words in the input document. In addition, to leverage our WSD method, we further propose a new word similarity calculation method based on the semantic network structure of BabelNet. We evaluate the proposed methods on the SemEval-13 and SemEval-15 for English WSD dataset. Experimental results demonstrate that the proposed WSD method significantly improves the baseline WSD method. Furthermore, our WSD system outperforms the state-of-the-art WSD systems in the Semeval-13 dataset. Finally, it has higher performance than the state-of-the-art unsupervised knowledge-based WSD system in the average performance of both datasets.

2017

pdf bib abs

A Method to Generate a Machine-Labeled Data for Biomedical Named Entity Recognition with Various Sub-Domains
Juae Kim | Sunjae Kwon | Youngjoong Ko | Jungyun Seo
Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)

Biomedical Named Entity (NE) recognition is a core technique for various works in the biomedical domain. In previous studies, using machine learning algorithm shows better performance than dictionary-based and rule-based approaches because there are too many terminological variations of biomedical NEs and new biomedical NEs are constantly generated. To achieve the high performance with a machine-learning algorithm, good-quality corpora are required. However, it is difficult to obtain the good-quality corpora because an-notating a biomedical corpus for ma-chine-learning is extremely time-consuming and costly. In addition, most previous corpora are insufficient for high-level tasks because they cannot cover various domains. Therefore, we propose a method for generating a large amount of machine-labeled data that covers various domains. To generate a large amount of machine-labeled data, firstly we generate an initial machine-labeled data by using a chunker and MetaMap. The chunker is developed to extract only biomedical NEs with manually annotated data. MetaMap is used to annotate the category of bio-medical NE. Then we apply the self-training approach to bootstrap the performance of initial machine-labeled data. In our experiments, the biomedical NE recognition system that is trained with our proposed machine-labeled data achieves much high performance. As a result, our system outperforms biomedical NE recognition system that using MetaMap only with 26.03%p improvements on F1-score.

Extracting Comparative Sentences from Korean Text Documents Using Comparative Lexical Patterns and Machine Learning Techniques
Seon Yang | Youngjoong Ko
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers