中文专利关键信息语料库的构建研究(Research on the construction of Chinese patent key information corpus)

Wenting Zhang (张文婷), Meihan Zhao (赵美含), Yixuan Ma (马翊轩), Wenrui Wang (王文瑞), Yuzhe Liu (刘宇哲), Muyun Yang (杨沐昀)


Abstract
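Patent documents are an important form of technical literature, and work on them is an important part of building a country strong in intellectual property. Existing patent corpora mostly target information retrieval, machine translation, and text classification; they lack finer-grained annotation and cannot support the development of newer artificial intelligence technologies such as question answering and reading comprehension. To meet the needs of intelligent patent analysis, this paper proposes annotating invention patents from three perspectives: the problem solved, the technical means, and the effect. On this basis it constructs a Chinese patent key information corpus containing 313 patent documents. Identifying and validating the key information in the corpus with named entity recognition techniques shows that recognizing patent key information is a larger-granularity information extraction problem, distinct from domain-specific named entity recognition.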
Anthology ID:
2022.ccl-1.41
Volume:
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Nanchang, China
Editors:
Maosong Sun (孙茂松), Yang Liu (刘洋), Wanxiang Che (车万翔), Yang Feng (冯洋), Xipeng Qiu (邱锡鹏), Gaoqi Rao (饶高琦), Yubo Chen (陈玉博)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
455–463
Language:
Chinese
URL:
https://aclanthology.org/2022.ccl-1.41
Cite (ACL):
Wenting Zhang, Meihan Zhao, Yixuan Ma, Wenrui Wang, Yuzhe Liu, and Muyun Yang. 2022. 中文专利关键信息语料库的构建研究(Research on the construction of Chinese patent key information corpus). In Proceedings of the 21st Chinese National Conference on Computational Linguistics, pages 455–463, Nanchang, China. Chinese Information Processing Society of China.
Cite (Informal):
中文专利关键信息语料库的构建研究(Research on the construction of Chinese patent key information corpus) (Zhang et al., CCL 2022)
PDF:
https://aclanthology.org/2022.ccl-1.41.pdf