Harksoo Kim


2024

pdf bib
Analyzing Key Factors Influencing Emotion Prediction Performance of VLLMs in Conversational Contexts
Jaewook Lee | Yeajin Jang | Hongjin Kim | Woojin Lee | Harksoo Kim
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Emotional intelligence (EI) in artificial intelligence (AI), which refers to the ability of an AI to understand and respond appropriately to human emotions, has emerged as a crucial research topic. Recent studies have shown that large language models (LLMs) and vision large language models (VLLMs) possess EI and the ability to understand emotional stimuli in the form of text and images, respectively. However, factors influencing the emotion prediction performance of VLLMs in real-world conversational contexts have not been sufficiently explored. This study aims to analyze the key elements affecting the emotion prediction performance of VLLMs in conversational contexts systematically. To achieve this, we reconstructed the MELD dataset, which is based on the popular TV series Friends, and conducted experiments through three sub-tasks: overall emotion tone prediction, character emotion prediction, and contextually appropriate emotion expression selection. We evaluated the performance differences based on various model architectures (e.g., image encoders, modality alignment, and LLMs) and image scopes (e.g., entire scene, person, and facial expression). In addition, we investigated the impact of providing persona information on the emotion prediction performance of the models and analyzed how personality traits and speaking styles influenced the emotion prediction process. We conducted an in-depth analysis of the impact of various other factors, such as gender and regional biases, on the emotion prediction performance of VLLMs. The results revealed that these factors significantly influenced the model performance.

pdf bib
Exploring Nested Named Entity Recognition with Large Language Models: Methods, Challenges, and Insights
Hongjin Kim | Jai-Eun Kim | Harksoo Kim
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Nested Named Entity Recognition (NER) poses a significant challenge in Natural Language Processing (NLP), demanding sophisticated techniques to identify entities within entities. This research investigates the application of Large Language Models (LLMs) to nested NER, exploring methodologies from prior work and introducing specific reasoning techniques and instructions to improve LLM efficacy. Through experiments conducted on the ACE 2004, ACE 2005, and GENIA datasets, we evaluate the impact of these approaches on nested NER performance. Results indicate that output format critically influences nested NER performance, methodologies from previous works are less effective, and our nested NER-tailored instructions significantly enhance performance. Additionally, we find that label information and descriptions of nested cases are crucial in eliciting the capabilities of LLMs for nested NER, especially in specific domains (i.e., the GENIA dataset). However, these methods still do not outperform BERT-based models, highlighting the ongoing need for innovative approaches in nested NER with LLMs.

pdf bib
Bridging the Code Gap: A Joint Learning Framework across Medical Coding Systems
Geunyeong Jeong | Seokwon Jeong | Juoh Sun | Harksoo Kim
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Automated Medical Coding (AMC) is the task of automatically converting free-text medical documents into predefined codes according to a specific medical coding system. Although deep learning has significantly advanced AMC, the class imbalance problem remains a significant challenge. To address this issue, most existing methods consider only a single coding system and disregard the potential benefits of reflecting the relevance between different coding systems. To bridge this gap, we introduce a Joint learning framework for Across Medical coding Systems (JAMS), which jointly learns different coding systems through multi-task learning. It learns various representations using a shared encoder and explicitly captures the relationships across these coding systems using the medical code attention network, a modification of the graph attention network. In the experiments on the MIMIC-IV ICD-9 and MIMIC-IV ICD-10 datasets, connected through General Equivalence Mappings, JAMS improved the performance consistently regardless of the backbone models. This result demonstrates its model-agnostic characteristic, which is not constrained by specific model structures. Notably, JAMS significantly improved the performance of low-frequency codes. Our analysis shows that these performance gains are due to the connections between the codes of the different coding systems.

pdf bib
Title-based Extractive Summarization via MRC Framework
Hongjin Kim | Jai-Eun Kim | Harksoo Kim
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Existing studies on extractive summarization have primarily focused on scoring and selecting summary sentences independently. However, these models are limited to sentence-level extraction and tend to select highly generalized sentences while overlooking the overall content of a document. To effectively consider the semantics of a document, in this study, we introduce a novel machine reading comprehension (MRC) framework for extractive summarization (MRCSum) by setting a query as the title. Our framework enables MRCSum to consider the semantic coherence and relevance of summary sentences in relation to the overall content. In particular, when a title is not available, we generate a title-like query, which is expected to achieve the same effect as a title. Our title-like query consists of the topic and keywords to serve as information on the main topic or theme of the document. We conduct experiments in both Korean and English languages, evaluating the performance of MRCSum on datasets comprising both long and short summaries. Our results demonstrate the effectiveness of MRCSum in extractive summarization, showcasing its ability to generate concise and informative summaries with or without explicit titles. Furthermore, our MRCSum outperforms existing models by capturing the essence of the document content and producing more coherent summaries.

2023

pdf bib
A Framework for Vision-Language Warm-up Tasks in Multimodal Dialogue Models
Jaewook Lee | Seongsik Park | Seong-Heum Park | Hongjin Kim | Harksoo Kim
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Most research on multimodal open-domain dialogue agents has focused on pretraining and multi-task learning using additional rich datasets beyond a given target dataset. However, methods for exploiting these additional datasets can be quite limited in real-world settings, creating a need for more efficient methods for constructing agents based solely on the target dataset. To address these issues, we present a new learning strategy called vision-language warm-up tasks for multimodal dialogue models (VLAW-MDM). This strategy does not require the use of large pretraining or multi-task datasets but rather relies solely on learning from target data. Moreover, our proposed approach automatically generate captions for images and incorporate them into the model’s input to improve the contextualization of visual information. Using this novel approach, we empirically demonstrate that our learning strategy is effective for limited data and relatively small models. The result show that our method achieved comparable and in some cases superior performance compared to existing state-of-the-art models on various evaluation metrics.

pdf bib
Improving Automatic KCD Coding: Introducing the KoDAK and an Optimized Tokenization Method for Korean Clinical Documents
Geunyeong Jeong | Juoh Sun | Seokwon Jeong | Hyunjin Shin | Harksoo Kim
Proceedings of the 5th Clinical Natural Language Processing Workshop

International Classification of Diseases (ICD) coding is the task of assigning a patient’s electronic health records into standardized codes, which is crucial for enhancing medical services and reducing healthcare costs. In Korea, automatic Korean Standard Classification of Diseases (KCD) coding has been hindered by limited resources, differences in ICD systems, and language-specific characteristics. Therefore, we construct the Korean Dataset for Automatic KCD coding (KoDAK) by collecting and preprocessing Korean clinical documents. In addition, we propose a tokenization method optimized for Korean clinical documents. Our experiments show that our proposed method outperforms Korean Medical BERT (KM-BERT) in Macro-F1 performance by 0.14%p while using fewer model parameters, demonstrating its effectiveness in Korean clinical documents.

2022

pdf bib
Pipeline Coreference Resolution Model for Anaphoric Identity in Dialogues
Damrin Kim | Seongsik Park | Mirae Han | Harksoo Kim
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

CODI-CRAC 2022 Shared Task in Dialogues consists of three sub-tasks: Sub-task 1 is the resolution of anaphoric identity, sub-task 2 is the resolution of bridging references, and sub-task 3 is the resolution of discourse deixis/abstract anaphora. Anaphora resolution is the task of detecting mentions from input documents and clustering the mentions of the same entity. The end-to-end model proceeds with the pruning of the candidate mention, and the pruning has the possibility of removing the correct mention. Also, the end-to-end anaphora resolution model has high model complexity, which takes a long time to train. Therefore, we proceed with the anaphora resolution as a two-stage pipeline model. In the first mention detection step, the score of the candidate word span is calculated, and the mention is predicted without pruning. In the second anaphora resolution step, the pair of mentions of the anaphora resolution relationship is predicted using the mentions predicted in the mention detection step. We propose a two-stage anaphora resolution pipeline model that reduces model complexity and training time, and maintains similar performance to end-to-end models. As a result of the experiment, the anaphora resolution showed a performance of 68.27% in Light, 48.87% in AMI, 69.06% in Persuasion, and 60.99% on Switchboard. Our final system ranked 3rd on the leaderboard of sub-task 1.

2021

pdf bib
Deep Context- and Relation-Aware Learning for Aspect-based Sentiment Analysis
Shinhyeok Oh | Dongyub Lee | Taesun Whang | IlNam Park | Seo Gaeun | EungGyun Kim | Harksoo Kim
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Existing works for aspect-based sentiment analysis (ABSA) have adopted a unified approach, which allows the interactive relations among subtasks. However, we observe that these methods tend to predict polarities based on the literal meaning of aspect and opinion terms and mainly consider relations implicitly among subtasks at the word level. In addition, identifying multiple aspect–opinion pairs with their polarities is much more challenging. Therefore, a comprehensive understanding of contextual information w.r.t. the aspect and opinion are further required in ABSA. In this paper, we propose Deep Contextualized Relation-Aware Network (DCRAN), which allows interactive relations among subtasks with deep contextual information based on two modules (i.e., Aspect and Opinion Propagation and Explicit Self-Supervised Strategies). Especially, we design novel self-supervised strategies for ABSA, which have strengths in dealing with multiple aspects. Experimental results show that DCRAN significantly outperforms previous state-of-the-art methods by large margins on three widely used benchmarks.

pdf bib
Document-Grounded Goal-Oriented Dialogue Systems on Pre-Trained Language Model with Diverse Input Representation
Boeun Kim | Dohaeng Lee | Sihyung Kim | Yejin Lee | Jin-Xia Huang | Oh-Woog Kwon | Harksoo Kim
Proceedings of the 1st Workshop on Document-grounded Dialogue and Conversational Question Answering (DialDoc 2021)

Document-grounded goal-oriented dialog system understands users’ utterances, and generates proper responses by using information obtained from documents. The Dialdoc21 shared task consists of two subtasks; subtask1, finding text spans associated with users’ utterances from documents, and subtask2, generating responses based on information obtained from subtask1. In this paper, we propose two models (i.e., a knowledge span prediction model and a response generation model) for the subtask1 and the subtask2. In the subtask1, dialogue act losses are used with RoBERTa, and title embeddings are added to input representation of RoBERTa. In the subtask2, various special tokens and embeddings are added to input representation of BART’s encoder. Then, we propose a method to assign different difficulty scores to leverage curriculum learning. In the subtask1, our span prediction model achieved F1-scores of 74.81 (ranked at top 7) and 73.41 (ranked at top 5) in test-dev phase and test phase, respectively. In the subtask2, our response generation model achieved sacreBLEUs of 37.50 (ranked at top 3) and 41.06 (ranked at top 1) in in test-dev phase and test phase, respectively.

pdf bib
The Pipeline Model for Resolution of Anaphoric Reference and Resolution of Entity Reference
Hongjin Kim | Damrin Kim | Harksoo Kim
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

The objective of anaphora resolution in dialogue shared-task is to go above and beyond the simple cases of coreference resolution in written text on which NLP has mostly focused so far, which arguably overestimate the performance of current SOTA models. The anaphora resolution in dialogue shared-task consists of three subtasks; subtask1, resolution of anaphoric identity and non-referring expression identification, subtask2, resolution of bridging references, and subtask3, resolution of discourse deixis/abstract anaphora. In this paper, we propose the pipelined model (i.e., a resolution of anaphoric identity and a resolution of bridging references) for the subtask1 and the subtask2. In the subtask1, our model detects mention via the parentheses prediction. Then, we yield mention representation using the token representation constituting the mention. Mention representation is fed to the coreference resolution model for clustering. In the subtask2, our model resolves bridging references via the MRC framework. We construct query for each entity as “What is related of ENTITY?”. The input of our model is query and documents(i.e., all utterances of dialogue). Then, our model predicts entity span that is answer for query.

2019

pdf bib
ThisIsCompetition at SemEval-2019 Task 9: BERT is unstable for out-of-domain samples
Cheoneum Park | Juae Kim | Hyeon-gu Lee | Reinald Kim Amplayo | Harksoo Kim | Jungyun Seo | Changki Lee
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our system, Joint Encoders for Stable Suggestion Inference (JESSI), for the SemEval 2019 Task 9: Suggestion Mining from Online Reviews and Forums. JESSI is a combination of two sentence encoders: (a) one using multiple pre-trained word embeddings learned from log-bilinear regression (GloVe) and translation (CoVe) models, and (b) one on top of word encodings from a pre-trained deep bidirectional transformer (BERT). We include a domain adversarial training module when training for out-of-domain samples. Our experiments show that while BERT performs exceptionally well for in-domain samples, several runs of the model show that it is unstable for out-of-domain samples. The problem is mitigated tremendously by (1) combining BERT with a non-BERT encoder, and (2) using an RNN-based classifier on top of BERT. Our final models obtained second place with 77.78% F-Score on Subtask A (i.e. in-domain) and achieved an F-Score of 79.59% on Subtask B (i.e. out-of-domain), even without using any additional external data.

pdf bib
Relation Extraction among Multiple Entities Using a Dual Pointer Network with a Multi-Head Attention Mechanism
Seong Sik Park | Harksoo Kim
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)

Many previous studies on relation extrac-tion have been focused on finding only one relation between two entities in a single sentence. However, we can easily find the fact that multiple entities exist in a single sentence and the entities form multiple relations. To resolve this prob-lem, we propose a relation extraction model based on a dual pointer network with a multi-head attention mechanism. The proposed model finds n-to-1 subject-object relations by using a forward de-coder called an object decoder. Then, it finds 1-to-n subject-object relations by using a backward decoder called a sub-ject decoder. In the experiments with the ACE-05 dataset and the NYT dataset, the proposed model achieved the state-of-the-art performances (F1-score of 80.5% in the ACE-05 dataset, F1-score of 78.3% in the NYT dataset)

2018

pdf bib
Two-Step Training and Mixed Encoding-Decoding for Implementing a Generative Chatbot with a Small Dialogue Corpus
Jintae Kim | Hyeon-Gu Lee | Harksoo Kim | Yeonsoo Lee | Young-Gil Kim
Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)

2016

pdf bib
KSAnswer: Question-answering System of Kangwon National University and Sogang University in the 2016 BioASQ Challenge
Hyeon-gu Lee | Minkyoung Kim | Harksoo Kim | Juae Kim | Sunjae Kwon | Jungyun Seo | Yi-reun Kim | Jung-Kyu Choi
Proceedings of the Fourth BioASQ workshop

2008

pdf bib
Information extraction using finite state automata and syllable n-grams in a mobile environment
Choong-Nyoung Seon | Harksoo Kim | Jungyun Seo
Proceedings of the ACL-08: HLT Workshop on Mobile Language Processing

pdf bib
Speakers’ Intention Prediction Using Statistics of Multi-level Features in a Schedule Management Domain
Donghyun Kim | Hyunjung Lee | Choong-Nyoung Seon | Harksoo Kim | Jungyun Seo
Proceedings of ACL-08: HLT, Short Papers

2002

pdf bib
A Reliable Indexing Method for a Practical QA System
Harksoo Kim | Jungyun Seo
COLING-02: Multilingual Summarization and Question Answering

2001

pdf bib
MAYA: A Fast Question-answering System Based on a Predictive Answer Indexer
Harksoo Kim | Kyungsun Kim | Gary Geunbae Lee | Jungyun Seo
Proceedings of the ACL 2001 Workshop on Open-Domain Question Answering

1999

pdf bib
Anaphora Resolution using Extended Centen’ng Algorithm in a Multi-modal Dialogue System
Harksoo Kim | Jeong-Mi Cho | Jungyun Seo
The Relation of Discourse/Dialogue Structure and Reference