Sangchul Hahn
2026
LLM Plug-ins Are Not a Free Lunch for Clinical Time-Series Prediction
Juhwan Choi | Kwanhyung Lee | Sangchul Hahn | Eunho Yang
Proceedings of the 1st Workshop on Linguistic Analysis for Health (HeaLing 2026)
Inspired by recent plug-in frameworks that repurpose frozen layers from large language models (LLMs) as inductive priors, we explore whether such mechanisms can be extended to clinical time-series prediction without textual inputs or LLM fine-tuning. We introduce a lightweight plug-in architecture that inserts a single frozen LLM Transformer layer between an aggregated time-series representation and the prediction head. Unlike prior work focused on vision or language tasks, our study targets clinical time-series data, where LLMs typically underperform when applied directly. Experiments on two ICU prediction tasks from MIMIC-III show that the proposed plug-in exhibits heterogeneous effects across backbones and tasks, with occasional performance improvements and minimal computational overhead. We further compare general-purpose and medical-domain LLM layers under an identical plug-in setting, analyzing how domain specialization interacts with clinical time-series models. Overall, our results highlight important limitations of frozen LLM plug-ins and motivate future work on understanding the conditions under which such layers may be beneficial.
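The plug-in placement described in the abstract (backbone encoder → single frozen Transformer-style layer → prediction head) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the encoder, the frozen layer (here a fixed random matrix standing in for a pretrained LLM layer), and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_model = 8, 16
W_enc = rng.normal(size=(d_in, d_model)) * 0.3      # trainable encoder weights
W_frozen = rng.normal(size=(d_model, d_model)) * 0.1  # stand-in for a frozen LLM layer
W_head = rng.normal(size=(d_model, 1)) * 0.3        # trainable prediction head

def backbone(x):
    # Stand-in time-series encoder: mean-pool over time, then a linear map.
    return np.tanh(x.mean(axis=1) @ W_enc)

def frozen_llm_layer(h):
    # The plugged-in layer: weights are never updated during training.
    # A real LLM layer would apply self-attention + FFN; here a fixed
    # linear map with a residual connection illustrates the data flow.
    return h + h @ W_frozen

def head(h):
    # Binary ICU-outcome probability via a sigmoid.
    return 1.0 / (1.0 + np.exp(-(h @ W_head)))

x = rng.normal(size=(4, 24, d_in))  # 4 ICU stays, 24 hourly steps, 8 features
p = head(frozen_llm_layer(backbone(x)))
print(p.shape)  # (4, 1)
```

In a real setup only `W_enc` and `W_head` would receive gradients, which is what keeps the computational overhead of the plug-in small.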
2019
Self-Knowledge Distillation in Natural Language Processing
Sangchul Hahn | Heeyoul Choi
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Since deep learning became a key player in natural language processing (NLP), many deep learning models have shown remarkable performance across a variety of NLP tasks. Such high performance can be explained by the efficient knowledge representations that deep learning models learn. Knowledge distillation from pretrained deep networks suggests that we can use additional information from the soft target probabilities to train other neural networks. In this paper, we propose a self-knowledge distillation method based on the soft target probabilities of the training model itself, where multimode information is distilled from the word embedding space right below the softmax layer. To keep the time complexity manageable, our method approximates the soft target probabilities. In experiments, we applied the proposed method to two fundamental NLP tasks: language modeling and neural machine translation. The experimental results show that the proposed method improves performance on both tasks.
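The core training objective in self-knowledge distillation can be sketched as a mix of hard-label cross-entropy and a divergence to temperature-smoothed soft targets taken from the model itself. This is a simplified sketch: the paper derives its soft targets from the word embedding space and approximates them, whereas the function names, temperature `T`, and mixing weight `alpha` below are illustrative assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T flattens the distribution.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_distill_loss(logits, target, T=2.0, alpha=0.5):
    # Hard cross-entropy against the gold token index.
    p = softmax(logits)
    ce = -np.log(p[target])
    # Soft target: a smoothed copy of the model's own predictions,
    # treated as fixed (no gradient flows through it in a real trainer).
    soft = softmax(logits, T=T)
    kl = np.sum(soft * (np.log(soft) - np.log(p)))  # KL(soft || p) >= 0
    return (1.0 - alpha) * ce + alpha * kl

logits = np.array([2.0, 0.5, -1.0])  # toy vocabulary of 3 tokens
loss = self_distill_loss(logits, target=0)
```

The soft term carries the "multimode" information: it rewards the model for keeping plausible mass on near-synonyms rather than collapsing everything onto the single gold token.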