PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts

Ming Zhang; Yuhui Wang; Yujiong Shen; Tingyi Yang; Changhao Jiang; Yilong Wu; Shihan Dou; Qinhao Chen; Zhiheng Xi; Zhihao Zhang; Yi Dong; Zhen Wang; Zhihui Fei; Mingyang Wan; Tao Liang; Guojun Ma; Qi Zhang; Tao Gui; Xuan-Jing Huang (黄萱菁)

doi:10.18653/v1/2025.findings-acl.134

Abstract

Process-driven dialogue systems, which operate under strict predefined process constraints, are essential in customer service and equipment maintenance scenarios. Although Large Language Models (LLMs) have shown remarkable progress in dialogue and reasoning, they still struggle with these strictly constrained dialogue tasks. To address this challenge, we construct the Process Flow Dialogue (PFDial) dataset, which contains 12,705 high-quality Chinese dialogue instructions derived from 440 flowcharts containing 5,055 process nodes. Based on the PlantUML specification, each UML flowchart is converted into atomic dialogue units, i.e., structured five-tuples. Experimental results demonstrate that a 7B model trained on merely 800 samples and a 0.5B model trained on the full dataset can both surpass 90% accuracy. Additionally, the 8B model can surpass GPT-4o by up to 43.88%, with an average of 11.00%. We further evaluate models’ performance on challenging backward transitions in process flows and conduct an in-depth analysis of various dataset formats to reveal their impact on model performance in handling decision and sequential branches. The data is released at https://github.com/KongLongGeFDU/PFDial.

Anthology ID: 2025.findings-acl.134
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2626–2649
URL: https://aclanthology.org/2025.findings-acl.134/
DOI: 10.18653/v1/2025.findings-acl.134
Cite (ACL): Ming Zhang, Yuhui Wang, Yujiong Shen, Tingyi Yang, Changhao Jiang, Yilong Wu, Shihan Dou, Qinhao Chen, Zhiheng Xi, Zhihao Zhang, Yi Dong, Zhen Wang, Zhihui Fei, Mingyang Wan, Tao Liang, Guojun Ma, Qi Zhang, Tao Gui, and Xuanjing Huang. 2025. PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts. In Findings of the Association for Computational Linguistics: ACL 2025, pages 2626–2649, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts (Zhang et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.134.pdf
