AuditWen: An Open-Source Large Language Model for Audit

Huang Jiajia, Zhu Haoran, Xu Chao, Zhan Tianming, Xie Qianqian, Huang Jimin


Abstract
“Intelligent auditing represents a crucial advancement in modern audit practices, enhancing both the quality and efficiency of audits within the realm of artificial intelligence. With the rise of large language models (LLMs), there is enormous potential for intelligent models to contribute to the audit domain. However, general LLMs applied in the audit domain face the challenges of lacking specialized knowledge and the presence of data biases. To overcome these challenges, this study introduces AuditWen, an open-source audit LLM built by fine-tuning Qwen on instruction data constructed from the audit domain. We first outline the application scenarios for LLMs in auditing and extract requirements that shape the development of LLMs tailored for audit purposes. We then propose an audit LLM, called AuditWen, by fine-tuning Qwen on a constructed 30k-instruction dataset covering 15 audit tasks and 3 layers. In the evaluation stage, we propose a benchmark with 5k instructions that covers a set of critical audit tasks derived from the application scenarios. With the benchmark, we compare AuditWen with other existing LLMs on information extraction, question answering and document generation. The experimental results demonstrate superior performance of AuditWen both in question understanding and answer generation, making it an immediately valuable tool for audit.”
Keywords: AuditWen, LLM, instruction dataset, fine-tuning, benchmark
Anthology ID:
2024.ccl-1.104
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1351–1365
Language:
English
URL:
https://aclanthology.org/2024.ccl-1.104/
Cite (ACL):
Huang Jiajia, Zhu Haoran, Xu Chao, Zhan Tianming, Xie Qianqian, and Huang Jimin. 2024. AuditWen: An Open-Source Large Language Model for Audit. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 1351–1365, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
AuditWen: An Open-Source Large Language Model for Audit (Jiajia et al., CCL 2024)
PDF:
https://aclanthology.org/2024.ccl-1.104.pdf