Jinghang Xu


2020

A Review of Dataset and Labeling Methods for Causality Extraction
Jinghang Xu | Wanli Zuo | Shining Liang | Xianglin Zuo
Proceedings of the 28th International Conference on Computational Linguistics

Causality represents the most important kind of correlation between events. Extracting causality from text has become a promising hot topic in NLP. However, there are no mature research systems or datasets for public evaluation. Moreover, there is a lack of unified causal sequence labeling methods; these gaps constitute the key factors that hinder the progress of causality extraction research. We comprehensively survey the limitations and shortcomings of the existing causality research field from the aspects of basic concepts, extraction methods, experimental data, and labeling methods, so as to provide a reference for future research on causality extraction. We summarize the existing causality datasets, explore their practicability and extensibility from multiple perspectives, and create a new causal dataset, ESC. To address the problem of causal sequence labeling, we analyse the existing methods, summarize their rules, and propose a new core-word causal labeling method. Multiple candidate causal label sequences are put forward according to label controversy to explore the optimal labeling method through experiments, and suggestions are provided for selecting a labeling method.