Seungpil Won

2024

Learning to Adapt Large Language Models to One-Shot In-Context Intent Classification on Unseen Domains
Joongbo Shin | Youbin Ahn | Seungpil Won | Stanley Jungkyu Choi
Proceedings of the 1st Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U)

In this paper, we explore one-shot in-context intent classification using large language models (LLMs) with the goal of minimizing the effort required to adapt models to unseen domains. To enhance the one-shot in-context learning capabilities of LLMs, we employ in-context tuning, leveraging its cross-domain transferability to unseen domains. To this end, we introduce the IC-collection, a compilation of open-source intent classification datasets from diverse domains, which are meticulously divided into held-in and held-out datasets. Our experiments demonstrate the effectiveness of the proposed method, showing that our model, with only 7B parameters, not only outperforms GPT-4 on intent classification but also achieves state-of-the-art results in unseen domains with only one-shot demonstrations. Both our benchmark and model will be made publicly available to advance research in chatbot systems.

Exploring the Use of Natural Language Descriptions of Intents for Large Language Models in Zero-shot Intent Classification
Taesuk Hong | Youbin Ahn | Dongkyu Lee | Joongbo Shin | Seungpil Won | Janghoon Han | Stanley Jungkyu Choi | Jungyun Seo
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue

In task-oriented dialogue systems, intent classification is crucial for accurately understanding user queries and providing appropriate services. This study explores the use of intent descriptions with large language models for unseen domain intent classification. By examining the effects of description quality, quantity, and input length management, we identify practical guidelines for optimizing performance. Our experiments using FLAN-T5 3B demonstrate that 1) high-quality descriptions for both training and testing significantly improve accuracy, 2) diversity in training descriptions doesn’t greatly affect performance, and 3) off-the-shelf rankers selecting around ten intent options reduce input length without compromising performance. We emphasize that high-quality testing descriptions have a greater impact on accuracy than training descriptions. These findings provide practical guidelines for using intent descriptions with large language models to achieve effective and efficient intent classification in low-resource settings.
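
The third finding above (shortlisting around ten intent options with a ranker before prompting the model) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the token-overlap scorer is a toy stand-in for an off-the-shelf ranker, and the function names, prompt layout, and example intents are hypothetical rather than taken from the paper.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def shortlist_intents(query: str, descriptions: Dict[str, str], top_k: int = 10) -> List[str]:
    """Return the top_k intent names whose descriptions best match the query.

    Toy ranker (assumption): Jaccard overlap between query and description tokens.
    """
    q = tokens(query)

    def score(intent: str) -> float:
        d = tokens(descriptions[intent])
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(descriptions, key=score, reverse=True)[:top_k]


def build_prompt(query: str, descriptions: Dict[str, str], shortlist: List[str]) -> str:
    """Hypothetical prompt layout: only shortlisted intents and their descriptions."""
    options = "\n".join(f"- {name}: {descriptions[name]}" for name in shortlist)
    return f"Intent options:\n{options}\nQuery: {query}\nIntent:"


# Illustrative data, not from any dataset used in the paper.
descriptions = {
    "book_flight": "The user wants to book a flight ticket",
    "check_balance": "The user asks for the balance of a bank account",
    "play_music": "The user wants to play a song or music",
}
query = "please book me a flight to Seoul"
shortlist = shortlist_intents(query, descriptions, top_k=2)
prompt = build_prompt(query, descriptions, shortlist)
```

The prompt grows with the size of the shortlist rather than with the full intent inventory, which is the input-length saving the abstract refers to.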

2023

BREAK: Breaking the Dialogue State Tracking Barrier with Beam Search and Re-ranking
Seungpil Won | Heeyoung Kwak | Joongbo Shin | Janghoon Han | Kyomin Jung
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Despite the recent advances in dialogue state tracking (DST), the joint goal accuracy (JGA) of the existing methods on MultiWOZ 2.1 still remains merely 60%. In our preliminary error analysis, we find that beam search produces a pool of candidates that is likely to include the correct dialogue state. Motivated by this observation, we introduce a novel framework, called BREAK (Beam search and RE-rAnKing), that achieves outstanding performance on DST. BREAK performs DST in two stages: (i) generating k-best dialogue state candidates with beam search and (ii) re-ranking the candidates to select the correct dialogue state. This simple yet powerful framework shows state-of-the-art performance on all versions of MultiWOZ and M2M datasets. Most notably, we push the joint goal accuracy to 80-90% on MultiWOZ 2.1-2.4, which is an improvement of 23.6%, 26.3%, 21.7%, and 10.8% over the previous best-performing models, respectively. The data and code will be available at https://github.com/tony-won/DST-BREAK
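
The two-stage idea described above, generating k-best candidates and then re-ranking them, can be sketched as follows. This is not the authors' implementation (see the repository linked above for that): the candidate list with generator scores stands in for real beam search output, and the overlap-based scorer is a hypothetical placeholder for a learned re-ranker.

```python
from typing import Callable, Dict, List, Tuple

DialogueState = Dict[str, str]  # slot name -> slot value


def top_k_candidates(scored: List[Tuple[DialogueState, float]], k: int) -> List[DialogueState]:
    """Stage (i) stand-in: keep the k candidates with the highest generator score."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [state for state, _ in ranked[:k]]


def rerank(candidates: List[DialogueState], scorer: Callable[[DialogueState], float]) -> DialogueState:
    """Stage (ii): return the candidate that the re-ranker scores highest."""
    if not candidates:
        raise ValueError("no candidates to re-rank")
    return max(candidates, key=scorer)


def overlap_scorer(dialogue: str) -> Callable[[DialogueState], float]:
    """Toy re-ranker (assumption): count slot values that appear in the dialogue text."""
    text = dialogue.lower()

    def score(state: DialogueState) -> float:
        return sum(1.0 for value in state.values() if value.lower() in text)

    return score


# Illustrative beam output: (candidate state, generator log-probability).
dialogue = "I need a cheap hotel in the north."
beam = [
    ({"hotel-price": "expensive", "hotel-area": "north"}, -0.4),
    ({"hotel-price": "cheap", "hotel-area": "north"}, -0.6),
    ({"hotel-price": "cheap", "hotel-area": "south"}, -1.9),
]
candidates = top_k_candidates(beam, k=2)
best = rerank(candidates, overlap_scorer(dialogue))
```

In this toy case the generator's top-1 candidate is wrong, but the correct state is still in the candidate pool, so the second stage can recover it. That mirrors the observation in the abstract that beam search tends to include the correct dialogue state among its candidates.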

The goal of DSTC11 track 5 is to build task-oriented dialogue systems that can effectively utilize external knowledge sources such as FAQs and reviews. This year’s challenge differs from previous ones as it includes subjective knowledge snippets and requires multiple snippets for a single turn. We propose a pipeline system for the challenge focusing on entity tracking, knowledge selection, and response generation. Specifically, we devise a novel heuristic to ensemble the outputs from the rule-based method and neural model for entity tracking and knowledge selection. We also leverage metadata information in the knowledge source to handle fine-grained user queries. Our approach achieved first place in objective evaluation and third place in human evaluation of DSTC11 track 5.
