基于双层语义映射的大语言模型辅助古汉语事件抽取半自动标注框架(A Semi-automatic Annotation Framework for Event Extraction in Classical Chinese Assisted by Large Language Models Based)

Wei Congcong (卫聪聪), Li Wei (李炜), Feng Zhenbing (冯振冰), Shao Yanqiu (邵艳秋)


Abstract
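Although natural language processing (NLP) offers relatively mature solutions for event extraction (EE) in modern languages, research on event extraction for Classical Chinese remains constrained by scarce annotated data and semantically complex texts. We therefore propose using large language models (LLMs), which have recently achieved great success, to assist human annotators in data annotation. To address LLMs' insufficient training on Classical Chinese and their limited semantic understanding of it, we propose an LLM-assisted semi-automatic annotation framework for Classical Chinese event extraction based on two-layer semantic mapping. The framework uses modern Chinese translations of the Classical Chinese texts, together with event semantics theory and semantic dependency parsing, to give LLMs a rich representation of semantic information, and then maps the semantic dependency relations step by step onto concrete event information. With review and feedback from human annotators, it overcomes the limitations of existing NLP tools and LLMs in annotating Classical Chinese for event extraction. Experimental results show that our method improves both the accuracy and the efficiency of event extraction annotation for Classical Chinese, while reducing the reliance on domain experts and the manual annotation workload. It offers a new methodology for annotating low-resource languages and explores a new direction for data annotation in the era of large models.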
Anthology ID:
2024.ccl-1.46
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
588–599
Language:
Chinese
URL:
https://aclanthology.org/2024.ccl-1.46/
Cite (ACL):
Wei Congcong, Li Wei, Feng Zhenbing, and Shao Yanqiu. 2024. 基于双层语义映射的大语言模型辅助古汉语事件抽取半自动标注框架(A Semi-automatic Annotation Framework for Event Extraction in Classical Chinese Assisted by Large Language Models Based). In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 588–599, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
基于双层语义映射的大语言模型辅助古汉语事件抽取半自动标注框架(A Semi-automatic Annotation Framework for Event Extraction in Classical Chinese Assisted by Large Language Models Based) (Wei et al., CCL 2024)
PDF:
https://aclanthology.org/2024.ccl-1.46.pdf