SeongKu Kang


2024

pdf bib
Taxonomy-guided Semantic Indexing for Academic Paper Search
SeongKu Kang | Yunyi Zhang | Pengcheng Jiang | Dongha Lee | Jiawei Han | Hwanjo Yu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Academic paper search is an essential task for efficient literature discovery and scientific advancement. While dense retrieval has advanced various ad-hoc searches, it often struggles to match the underlying academic concepts between queries and documents, which is critical for paper search. To enable effective academic concept matching for paper search, we propose Taxonomy-guided Semantic Indexing (TaxoIndex) framework. TaxoIndex extracts key concepts from papers and organizes them as a semantic index guided by an academic taxonomy, and then leverages this index as foundational knowledge to identify academic concepts and link queries and documents. As a plug-and-play framework, TaxoIndex can be flexibly employed to enhance existing dense retrievers. Extensive experiments show that TaxoIndex brings significant improvements, even with highly limited training data, and greatly enhances interpretability.

pdf bib
Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset
Minjin Kim | Minju Kim | Hana Kim | Beong-woo Kwak | SeongKu Kang | Youngjae Yu | Jinyoung Yeo | Dongha Lee
Findings of the Association for Computational Linguistics: ACL 2024

Conversational recommender systems are an emerging area that has garnered increasing interest in the community, especially with the advancements in large language models (LLMs) that enable sophisticated handling of conversational input. Despite the progress, the field still has many aspects left to explore. The currently available public datasets for conversational recommendation lack specific user preferences and explanations for recommendations, hindering high-quality recommendations. To address such challenges, we present a novel conversational recommendation dataset named PEARL, synthesized with persona- and knowledge-augmented LLM simulators. We obtain detailed persona and knowledge from real-world reviews and construct a large-scale dataset with over 57k dialogues. Our experimental results demonstrate that PEARL contains more specific user preferences, show expertise in the target domain, and provides recommendations more relevant to the dialogue context than those in prior datasets. Furthermore, we demonstrate the utility of PEARL by showing that our downstream models outperform baselines in both human and automatic evaluations. We release our dataset and code.

pdf bib
Self-Consistent Reasoning-based Aspect-Sentiment Quad Prediction with Extract-Then-Assign Strategy
Jieyong Kim | Ryang Heo | Yongsik Seo | SeongKu Kang | Jinyoung Yeo | Dongha Lee
Findings of the Association for Computational Linguistics: ACL 2024

In the task of aspect sentiment quad prediction (ASQP), generative methods for predicting sentiment quads have shown promisingresults. However, they still suffer from imprecise predictions and limited interpretability, caused by data scarcity and inadequate modeling of the quadruplet composition process. In this paper, we propose Self-Consistent Reasoning-based Aspect sentiment quadruple Prediction (SCRAP), optimizing its model to generate reasonings and the corresponding sentiment quadruplets in sequence. SCRAP adopts the Extract-Then-Assign reasoning strategy, which closely mimics human cognition. In the end, SCRAP significantly improves the model’s ability to handle complex reasoning tasks and correctly predict quadruplets through consistency voting, resulting in enhanced interpretability and accuracy in ASQP.