Leveraging LLM-Generated Schema Descriptions for Unanswerable Question Detection in Clinical Data

Donghee Han, Seungjae Lim, Daeyoung Roh, Sangryul Kim, Sehyun Kim, Mun Yong Yi


Abstract
Recent advancements in large language models (LLMs) have boosted research on generating SQL queries from domain-specific questions, particularly in the medical domain. A key challenge is detecting and filtering unanswerable questions. Existing methods often rely on model uncertainty, but these approaches require extra resources and lack interpretability. We propose a lightweight model that predicts relevant database schemas to detect unanswerable questions, enhancing interpretability and addressing the data imbalance inherent in binary classification tasks. Furthermore, we found that LLM-generated schema descriptions can significantly enhance prediction accuracy. Our method provides a resource-efficient solution for unanswerable question detection in domain-specific question answering systems.
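To illustrate the idea in the abstract, the following is a minimal, hypothetical sketch of schema-relevance-based unanswerability detection. The paper's lightweight model and LLM-generated descriptions are not reproduced here; a toy lexical-overlap scorer (`relevance`), the table names, and the descriptions are all placeholder assumptions standing in for the learned relevance predictor.

```python
def relevance(question: str, description: str) -> float:
    """Toy relevance score: fraction of description words that appear
    in the question (a stand-in for a learned relevance model)."""
    q = set(question.lower().split())
    d = set(description.lower().split())
    return len(q & d) / len(d) if d else 0.0


def detect_unanswerable(question, schema_descriptions, threshold=0.2):
    """Flag a question as unanswerable when no schema element scores
    above the threshold; otherwise return the predicted schemas,
    which also serve as an interpretable rationale."""
    scores = {name: relevance(question, desc)
              for name, desc in schema_descriptions.items()}
    relevant = [name for name, score in scores.items() if score >= threshold]
    return len(relevant) == 0, relevant


# Hypothetical clinical tables with LLM-style natural-language descriptions.
schemas = {
    "prescriptions": "drug dose prescribed to a patient",
    "lab_events": "laboratory test results for a patient",
}

unanswerable, tables = detect_unanswerable(
    "What dose of the drug was prescribed?", schemas)
```

In this sketch, an answerable question maps to at least one relevant table (here, `prescriptions`), while a question unrelated to every schema description is flagged as unanswerable.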
Anthology ID:
2025.coling-main.706
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
10594–10601
URL:
https://aclanthology.org/2025.coling-main.706/
Cite (ACL):
Donghee Han, Seungjae Lim, Daeyoung Roh, Sangryul Kim, Sehyun Kim, and Mun Yong Yi. 2025. Leveraging LLM-Generated Schema Descriptions for Unanswerable Question Detection in Clinical Data. In Proceedings of the 31st International Conference on Computational Linguistics, pages 10594–10601, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Leveraging LLM-Generated Schema Descriptions for Unanswerable Question Detection in Clinical Data (Han et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.706.pdf