基于批数据过采样的中医临床记录四诊描述抽取方法(Four Diagnostic Description Extraction in Clinical Records of Traditional Chinese Medicine with Batch Data Oversampling)

Yaqiang Wang (王亚强), Kailun Li (李凯伦), Yongguang Jiang (蒋永光), Hongping Shu (舒红平)


Abstract
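Extracting four-diagnostic descriptions from clinical records of Traditional Chinese Medicine (TCM) has important practical value for improving the quality and efficiency of TCM clinical syndrome differentiation and treatment. The task remains largely unexplored, however, and imbalanced class distribution is one of its key challenges. This paper studies the task in three steps: it constructs a corpus for four-diagnostic description extraction from TCM clinical records; it fine-tunes a general-purpose pretrained language model on unlabeled TCM clinical records to achieve domain adaptation; and it trains the extraction model on a small amount of labeled data using a batch data oversampling algorithm. Experimental results show that the proposed method outperforms the comparison methods in overall performance. Compared with the best results of the comparison methods, it improves the F1 score on rare classes by 2.13% on average.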
Anthology ID:
2022.ccl-1.55
Volume:
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Nanchang, China
Editors:
Maosong Sun (孙茂松), Yang Liu (刘洋), Wanxiang Che (车万翔), Yang Feng (冯洋), Xipeng Qiu (邱锡鹏), Gaoqi Rao (饶高琦), Yubo Chen (陈玉博)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
611–622
Language:
Chinese
URL:
https://aclanthology.org/2022.ccl-1.55
Cite (ACL):
Yaqiang Wang, Kailun Li, Yongguang Jiang, and Hongping Shu. 2022. 基于批数据过采样的中医临床记录四诊描述抽取方法(Four Diagnostic Description Extraction in Clinical Records of Traditional Chinese Medicine with Batch Data Oversampling). In Proceedings of the 21st Chinese National Conference on Computational Linguistics, pages 611–622, Nanchang, China. Chinese Information Processing Society of China.
Cite (Informal):
基于批数据过采样的中医临床记录四诊描述抽取方法(Four Diagnostic Description Extraction in Clinical Records of Traditional Chinese Medicine with Batch Data Oversampling) (Wang et al., CCL 2022)
PDF:
https://aclanthology.org/2022.ccl-1.55.pdf