Jinseok Seol


2026

Large language models (LLMs) provide excellent performance, but their practical deployment is limited by the substantial compute and memory demands of large models and the latency of auto-regressive decoding. To mitigate these inefficiencies, block pruning reduces the number of executed transformer blocks, effectively lowering latency while preserving architectural coherence. However, existing methods typically rely on representation similarity or computationally expensive sensitivity analyses to estimate block importance, thereby neglecting task-aware model behavior. To address this limitation, we introduce Task-aware Block Pruning (TaBP), a novel approach that directly captures task-specific inference dynamics by quantifying block-level uncertainty from the statistics of each block’s early-exited output distribution on a calibration dataset. Since output distributions reflect the model’s confidence and decision uncertainty conditioned on downstream tasks, these statistics provide a principled signal for identifying blocks that are less critical for task performance. Extensive experiments demonstrate that TaBP preserves downstream task performance while substantially reducing inference latency and computational cost, without relying on cost-heavy sensitivity analyses. To facilitate reproducibility and further research, we release our implementation of TaBP on [GitHub](https://github.com/Song-haJo/TaBP).
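As a rough illustration of the idea described above, the following minimal sketch scores blocks by the entropy of their early-exited output distributions on calibration tokens. The abstract does not specify which statistic TaBP uses or how blocks are selected, so the choice of mean entropy, the "smallest entropy change" pruning rule, and the names `block_uncertainty` and `select_blocks_to_prune` are all assumptions made for illustration, not the released implementation.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def block_uncertainty(exit_logits):
    """Mean entropy of the early-exit distribution at each exit point.

    exit_logits: array of shape [num_exits, num_tokens, vocab_size], assumed to
    hold the logits obtained by applying the LM head to the hidden state at
    each exit point (index 0 = input to the first block) on calibration data.
    """
    probs = softmax(exit_logits)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)  # [num_exits, num_tokens]
    return entropy.mean(axis=-1)                              # [num_exits]

def select_blocks_to_prune(mean_entropy, num_prune):
    """Hypothetical rule: prune the blocks that change mean entropy the least.

    Block i maps exit point i to exit point i + 1, so its score is
    |H[i + 1] - H[i]|. This rule is an assumption, not taken from the paper.
    """
    delta = np.abs(np.diff(mean_entropy))
    order = np.argsort(delta, kind="stable")
    return sorted(int(i) for i in order[:num_prune])

# Usage with random stand-in logits: 5 exit points (4 blocks), 32 tokens, vocab of 100.
rng = np.random.default_rng(0)
fake_logits = rng.normal(size=(5, 32, 100))
scores = block_uncertainty(fake_logits)
pruned = select_blocks_to_prune(scores, num_prune=2)
```

In a real setting the logits would come from a forward pass of the model over a task-specific calibration set, which is what would make the resulting scores task-aware.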

2024

Large language models (LLMs) are used in a wide range of studies and show potential to function independently as recommendation models. Nevertheless, training on sequences and text labels modifies LLMs’ pre-trained weights, diminishing their inherent strength in constructing and comprehending natural language sentences. In this study, we propose a reconstruction-based LLM recommendation model (ReLRec) that harnesses the feature extraction capability of LLMs while preserving their sentence generation abilities. While training on sequential data, we reconstruct the user and item pseudo-labels generated from user reviews, aiming to exploit the key features of both users and items. Experimental results demonstrate the efficacy of label reconstruction in sequential recommendation tasks.

2017

Word embedding has become a fundamental component of many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so they are not directly applicable to highly agglutinative languages such as Korean. We propose a syllable-based learning model for Korean using a convolutional neural network, in which each word representation is composed from trained syllable vectors. Our model successfully produces morphologically meaningful representations of Korean words compared to the original Skip-gram embeddings. The results also show that it is quite robust to the Out-of-Vocabulary problem.
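The composition step described above can be sketched as follows: a word vector is built by running a 1-D convolution over the vectors of its syllables and max-pooling over positions. This is only an illustrative sketch. The class name `SyllableCNNEmbedder`, the dimensions, the filter width, and the tanh nonlinearity are assumptions, and the weights here are random rather than trained (the Skip-gram-style training objective is omitted).

```python
import unicodedata
import numpy as np

class SyllableCNNEmbedder:
    """Composes a word vector from syllable vectors with a 1-D CNN (untrained sketch)."""

    def __init__(self, syllables, dim=16, num_filters=8, width=2, seed=0):
        rng = np.random.default_rng(seed)
        self.dim = dim
        self.width = width
        # One vector per known syllable; in a real model these would be learned.
        self.table = {s: rng.normal(scale=0.1, size=dim) for s in syllables}
        # Each filter spans `width` consecutive syllable vectors.
        self.filters = rng.normal(scale=0.1, size=(num_filters, width * dim))

    def embed(self, word):
        # In NFC form, each precomposed Hangul character is one syllable block.
        word = unicodedata.normalize("NFC", word)
        if not word:
            raise ValueError("word must be non-empty")
        unknown = np.zeros(self.dim)  # unseen syllables map to a zero vector
        syl = np.stack([self.table.get(s, unknown) for s in word])
        # Zero-pad so that words shorter than the filter width yield one window.
        pad = max(0, self.width - len(word))
        syl = np.vstack([syl, np.zeros((pad, self.dim))])
        windows = np.stack([
            syl[i:i + self.width].ravel()
            for i in range(len(syl) - self.width + 1)
        ])
        feats = np.tanh(windows @ self.filters.T)  # [num_windows, num_filters]
        return feats.max(axis=0)                   # max-pool over positions

# Usage: the syllable inventory comes from "training" words; a word never seen
# as a whole can still be embedded as long as its syllables are known.
embedder = SyllableCNNEmbedder(syllables="한국어학교")
seen_word = embedder.embed("한국어")
unseen_word = embedder.embed("학교어")
```

Because the lookup table is indexed by syllables rather than whole words, any word made of known syllables receives a representation, which is one way to read the robustness to out-of-vocabulary words reported in the abstract.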