Minghui Liu


2024

pdf bib
DocEE-zh: A Fine-grained Benchmark for Chinese Document-level Event Extraction
Minghui Liu | MeiHan Tong | Yangda Peng | Lei Hou | Juanzi Li | Bin Xu
Findings of the Association for Computational Linguistics: EMNLP 2024

Event extraction aims to identify events and then extract the arguments involved in those events. In recent years, there has been a gradual shift from sentence-level event extraction to document-level event extraction research. Despite the significant success achieved in English domain event extraction research, event extraction in Chinese still remains largely unexplored. However, a major obstacle to promoting Chinese document-level event extraction is the lack of fine-grained, wide domain coverage datasets for model training and evaluation. In this paper, we propose DocEE-zh, a new Chinese document-level event extraction dataset comprising over 36,000 events and more than 210,000 arguments. DocEE-zh is an extension of the DocEE dataset, utilizing the same event schema, and all data has been meticulously annotated by human experts. We highlight two features: focus on high-interest event types and fine-grained argument types. Experimental results indicate that state-of-the-art models still fail to achieve satisfactory performance, with an F1 score of 45.88% on the event argument extraction task, revealing that Chinese document-level event extraction (DocEE) remains an unresolved challenge. DocEE-zh is now available at https://github.com/tongmeihan1995/DocEE.git.

2021

pdf bib
Learning from Miscellaneous Other-Class Words for Few-shot Named Entity Recognition
Meihan Tong | Shuai Wang | Bin Xu | Yixin Cao | Minghui Liu | Lei Hou | Juanzi Li
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Few-shot Named Entity Recognition (NER) exploits only a handful of annotations to iden- tify and classify named entity mentions. Pro- totypical network shows superior performance on few-shot NER. However, existing prototyp- ical methods fail to differentiate rich seman- tics in other-class words, which will aggravate overfitting under few shot scenario. To address the issue, we propose a novel model, Mining Undefined Classes from Other-class (MUCO), that can automatically induce different unde- fined classes from the other class to improve few-shot NER. With these extra-labeled unde- fined classes, our method will improve the dis- criminative ability of NER classifier and en- hance the understanding of predefined classes with stand-by semantic knowledge. Experi- mental results demonstrate that our model out- performs five state-of-the-art models in both 1- shot and 5-shots settings on four NER bench- marks. We will release the code upon accep- tance. The source code is released on https: //github.com/shuaiwa16/OtherClassNER.git.