2025
Format Inertia: A Failure Mechanism of LLMs in Medical Pre-Consultation
Seungseop Lim | Gibaeg Kim | Wooseok Han | Jean Seo | Hyunkyung Lee | Jaehyo Yoo | Eunho Yang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Recent advances in Large Language Models (LLMs) have brought significant improvements to various service domains, including chatbots and medical pre-consultation applications. In the healthcare domain, the most common approach for adapting LLMs to multi-turn dialogue generation is Supervised Fine-Tuning (SFT). However, datasets for SFT in tasks like medical pre-consultation typically exhibit a skewed turn-count distribution. Training on such data induces a novel failure mechanism we term Format Inertia, where models tend to generate repetitive, format-correct, but diagnostically uninformative questions in long medical dialogues. To mitigate this observed failure mechanism, we adopt a simple, data-centric method that rebalances the turn-count distribution of the training dataset. Experimental results show that our approach substantially alleviates Format Inertia in medical pre-consultation.
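The data-centric remedy described above amounts to resampling the SFT set so that no single turn count dominates. A minimal sketch of such rebalancing, assuming each training dialogue is a list of turns; this illustrates the idea, not the paper's exact procedure:

```python
import random
from collections import defaultdict

def rebalance_by_turn_count(dialogues, cap=None, seed=0):
    """Downsample dialogues so no turn-count bucket dominates.

    `dialogues` is assumed to be a list of multi-turn dialogues
    (each a list of turns). `cap` defaults to the size of the
    smallest bucket. Illustrative sketch, not the paper's method.
    """
    buckets = defaultdict(list)
    for dialogue in dialogues:
        buckets[len(dialogue)].append(dialogue)
    cap = cap or min(len(bucket) for bucket in buckets.values())
    rng = random.Random(seed)
    balanced = []
    for _, bucket in sorted(buckets.items()):
        rng.shuffle(bucket)          # random subset per bucket
        balanced.extend(bucket[:cap])
    return balanced
```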
Taxonomy of Comprehensive Safety for Clinical Agents
Jean Seo | Hyunkyung Lee | Gibaeg Kim | Wooseok Han | Jaehyo Yoo | Seungseop Lim | Kihun Shin | Eunho Yang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Safety is a paramount concern in clinical chatbot applications, where inaccurate or harmful responses can lead to serious consequences. Existing methods, such as guardrails and tool-calling, often fall short in addressing the nuanced demands of the clinical domain. In this paper, we introduce TACOS (Taxonomy of Comprehensive Safety for Clinical Agents), a fine-grained, 21-class taxonomy that integrates safety filtering and tool selection into a single user intent classification step. TACOS covers a wide spectrum of clinical and non-clinical queries, explicitly modeling varying safety thresholds and external tool dependencies. To validate our taxonomy, we curate a TACOS-annotated dataset and perform extensive experiments. Our results demonstrate the value of a new taxonomy specialized for clinical agent settings, and reveal valuable insights about training data distribution and the pretrained knowledge of base models.
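The single-classification-step design can be pictured as one classifier call whose output label jointly fixes the safety action and the tool route. A minimal sketch with hypothetical label names and routing entries; the paper's actual 21 classes differ:

```python
# Hypothetical routing table: each intent label maps to a safety
# action and an optional external tool. The real TACOS taxonomy
# defines 21 classes; these three labels are illustrative only.
ROUTES = {
    "clinical_symptom_triage": {"action": "answer", "tool": "symptom_checker"},
    "medication_dosage":       {"action": "escalate", "tool": None},
    "non_clinical_smalltalk":  {"action": "answer", "tool": None},
}

def route(user_query, classify):
    """Single classification step: `classify` maps a query to one
    intent label, which jointly decides safety handling and tool use."""
    label = classify(user_query)
    decision = ROUTES.get(label, {"action": "refuse", "tool": None})
    return label, decision["action"], decision["tool"]
```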
MoFE: Mixture of Frozen Experts Architecture
Jean Seo | Jaeyoon Kim | Hyopil Shin
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
We propose the Mixture of Frozen Experts (MoFE) architecture, which integrates Parameter-efficient Fine-tuning (PEFT) and the Mixture of Experts (MoE) architecture to enhance both training efficiency and model scalability. By freezing the Feed Forward Network (FFN) layers within the MoE framework, MoFE significantly reduces the number of trainable parameters, improving training efficiency while still allowing for effective knowledge transfer from the expert models. This facilitates the creation of models proficient in multiple domains. We conduct experiments to evaluate the trade-offs between performance and efficiency, compare MoFE with other PEFT methodologies, assess the impact of domain expertise in the constituent models, and determine the optimal training strategy. The results show that, although there may be some trade-offs in performance, the efficiency gains are substantial, making MoFE a reasonable solution for real-world, resource-constrained environments.
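The freezing step at the heart of MoFE can be sketched in PyTorch. The name matching below assumes typical MoE parameter naming ("expert", "ffn") and is illustrative rather than the paper's implementation:

```python
import torch.nn as nn

def freeze_expert_ffns(model: nn.Module):
    """Freeze every parameter inside expert FFN submodules so only
    routers, attention, and embeddings remain trainable. Matching on
    'expert'/'ffn' is a guess at common MoE parameter naming."""
    for name, param in model.named_parameters():
        if "expert" in name or "ffn" in name:
            param.requires_grad = False
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable params: {trainable}/{total}")
```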
2024
ManWav: The First Manchu ASR Model
Jean Seo | Minha Kang | SungJoo Byun | Sangah Lee
Proceedings of the Third Workshop on NLP Applications to Field Linguistics
This study addresses the widening gap in Automatic Speech Recognition (ASR) research between high resource and extremely low resource languages, with a particular focus on Manchu, a severely endangered language. Manchu exemplifies the challenges faced by marginalized linguistic communities in accessing state-of-the-art technologies. In a pioneering effort, we introduce the first-ever Manchu ASR model, ManWav, leveraging Wav2Vec2-XLSR-53. The results of the first Manchu ASR model are promising, especially when it is trained with our augmented data. Wav2Vec2-XLSR-53 fine-tuned with augmented data demonstrates a 0.02 drop in CER and a 0.13 drop in WER compared to the same base model fine-tuned with original data.
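Fine-tuning Wav2Vec2-XLSR-53 for CTC-based ASR follows the standard Hugging Face pattern. A minimal sketch, where the vocabulary size stands in for a Manchu character inventory and is not the paper's actual value:

```python
from transformers import Wav2Vec2ForCTC

# Load the multilingual XLSR-53 checkpoint and attach a fresh CTC
# head sized to the target vocabulary (vocab_size=40 is a placeholder
# for a Manchu character inventory, not the paper's actual setting).
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53",
    vocab_size=40,
    ctc_loss_reduction="mean",
)
# Common practice in low-resource fine-tuning: keep the convolutional
# feature encoder frozen and train only the transformer and CTC head.
model.freeze_feature_encoder()
```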
Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition
Sungjoo Byun | Jiseung Hong | Sumin Park | Dongjun Jang | Jean Seo | Minseok Kim | Chaeyoung Oh | Hyopil Shin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Named Entity Recognition (NER) plays a pivotal role in medical Natural Language Processing (NLP). Yet, there has been no open-source medical NER dataset specifically for the Korean language. To address this, we used ChatGPT to assist in constructing the KBMC (Korean Bio-Medical Corpus), which we now present to the public. With the KBMC dataset, we observed a 20% increase in medical NER performance compared to models trained on general Korean NER datasets. This research underscores the significant benefits of using specialized tools and datasets, such as ChatGPT, to enhance language processing in specialized fields like healthcare.
ManNER & ManPOS: Pioneering NLP for Endangered Manchu Language
Sangah Lee | Sungjoo Byun | Jean Seo | Minha Kang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
We present pioneering research in the realm of Natural Language Processing (NLP) for the endangered Manchu language. Recognizing the critical importance of linguistic preservation, we experiment with three language models – BiLSTM-CRF, BERT, and mBERT – for Named Entity Recognition (NER) and Part-of-Speech (POS) tagging tasks. Given the limited digitized Manchu text available, we augment the data using GloVe embeddings for the pre-training of BERT-based models. Remarkably, all models demonstrated outstanding performance, achieving over 90% F1 score in both NER and POS tagging tasks. Our research not only marks the first application of NLP on Manchu and the inaugural use of BERT-based models for the language but also stands as the first endeavor to employ Manchu for NER and POS tagging. To foster further exploration and applications in the field, we make our fine-tuning dataset and models available to the public. Through this research, we aim to underscore the significance of NLP in the protection and revitalization of low-resource languages.
2023
Mergen: The First Manchu-Korean Machine Translation Model Trained on Augmented Data
Jean Seo | Sungjoo Byun | Minha Kang | Sangah Lee