2021
CroAno: A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset
Baoli Zhang | Zhucong Li | Zhen Gan | Yubo Chen | Jing Wan | Kang Liu | Jun Zhao | Shengping Liu | Yafei Shi
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
In this paper, we introduce CroAno, a web-based crowd annotation platform for Chinese named entity recognition (NER). Besides basic crowd annotation features such as fast tagging and data management, CroAno provides a systematic solution for improving the label consistency of Chinese NER datasets. 1) Disagreement Adjudicator: CroAno uses a multi-dimensional highlight mode to visualize instance-level inconsistent entities and makes the revision process user-friendly. 2) Inconsistency Detector: CroAno employs a detector to locate corpus-level label inconsistency and provides users with an interface to correct inconsistent entities in batches. 3) Prediction Error Analyzer: We decompose the model's entity prediction errors into six fine-grained entity error types. Users can employ this error system to detect corpus-level inconsistency from a model perspective. To validate the effectiveness of our platform, we use CroAno to revise two public datasets. On the two revised datasets, model performance improves by +1.96% and +2.57% F1, respectively.
Classification, Extraction, and Normalization: CASIA_Unisound Team at the Social Media Mining for Health 2021 Shared Tasks
Tong Zhou | Zhucong Li | Zhen Gan | Baoli Zhang | Yubo Chen | Kun Niu | Jing Wan | Kang Liu | Jun Zhao | Yafei Shi | Weifeng Chong | Shengping Liu
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
This is the system description of the CASIA_Unisound team for Task 1, Task 7b, and Task 8 of the sixth Social Media Mining for Health Applications (SMM4H) shared task in 2021. To address two challenges shared across these tasks, colloquial text and imbalanced annotation, we apply a customized pre-trained language model and propose various training strategies. Experimental results show the effectiveness of our system. Moreover, we obtained an F1-score of 0.87 in Task 8, the highest among all participants.
2009
Parsing Syntactic and Semantic Dependencies for Multiple Languages with A Pipeline Approach
Han Ren | Donghong Ji | Jing Wan | Mingyao Zhang
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task
Finding Answers to Definition Questions Using Web Knowledge Bases
Han Ren | Donghong Ji | Jing Wan | Chong Teng
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2
2008
Automatic Chinese Catchword Extraction Based on Time Series Analysis
Han Ren | Donghong Ji | Jing Wan | Lei Han
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning