Dachun Sun


2025

pdf bib
DocCHA: Towards LLM-Augmented Interactive Online diagnosis System
Xinyi Liu | Dachun Sun | Yi Fung | Dilek Hakkani-Tur | Tarek F. Abdelzaher
Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Despite the impressive capabilities of Large Language Models (LLMs), existing Conversational Health Agents (CHAs) remain static and brittle, incapable of adaptive multi-turn reasoning, symptom clarification, or transparent decision-making. This hinders their real-world applicability in clinical diagnosis, where iterative and structured dialogue is essential. We propose DocCHA, a confidence-aware, modular framework that emulates clinical reasoning by decomposing the diagnostic process into three stages: (1) symptom elicitation, (2) history acquisition, and (3) causal graph construction. Each module uses interpretable confidence scores to guide adaptive questioning, prioritize informative clarifications, and refine weak reasoning links. Evaluated on two real-world Chinese consultation datasets (IMCS21, DX), DocCHA consistently outperforms strong prompting-based LLM baselines (GPT-3.5, GPT-4o, LLaMA-3), achieving up to 5.18% higher diagnostic accuracy and over 30% improvement in symptom recall, with only modest increase in dialogue turns. These results demonstrate DocCHA’s effectiveness in enabling structured, transparent, and efficient diagnostic conversations—paving the way for trustworthy LLM-powered clinical assistants in multilingual and resource-constrained settings.

2023

pdf bib
Noisy Positive-Unlabeled Learning with Self-Training for Speculative Knowledge Graph Reasoning
Ruijie Wang | Baoyu Li | Yichen Lu | Dachun Sun | Jinning Li | Yuchen Yan | Shengzhong Liu | Hanghang Tong | Tarek Abdelzaher
Findings of the Association for Computational Linguistics: ACL 2023

This paper studies speculative reasoning task on real-world knowledge graphs (KG) that contain both false negative issue (i.e., potential true facts being excluded) and false positive issue (i.e., unreliable or outdated facts being included). State-of-the-art methods fall short in the speculative reasoning ability, as they assume the correctness of a fact is solely determined by its presence in KG, making them vulnerable to false negative/positive issues. The new reasoning task is formulated as a noisy Positive-Unlabeled learning problem. We propose a variational framework, namely nPUGraph, that jointly estimates the correctness of both collected and uncollected facts (which we call label posterior) and updates model parameters during training. The label posterior estimation facilitates speculative reasoning from two perspectives. First, it improves the robustness of a label posterior-aware graph encoder against false positive links. Second, it identifies missing facts to provide high-quality grounds of reasoning. They are unified in a simple yet effective self-training procedure. Empirically, extensive experiments on three benchmark KG and one Twitter dataset with various degrees of false negative/positive cases demonstrate the effectiveness of nPUGraph.