Kuicai Dong


2023

From Speculation Detection to Trustworthy Relational Tuples in Information Extraction
Kuicai Dong | Aixin Sun | Jung-jae Kim | Xiaoli Li
Findings of the Association for Computational Linguistics: EMNLP 2023

Speculation detection is an important NLP task for identifying text factuality. However, the extracted speculative information (e.g., speculative polarity, cue, and scope) lacks structure and is difficult to use directly in downstream tasks. Open Information Extraction (OIE), on the other hand, extracts structured tuples as facts, without examining the certainty of these tuples. Bridging this gap between speculation detection and information extraction is necessary to generate structured speculative information and trustworthy relational tuples. Existing studies define speculation detection at the sentence level, but even if a sentence is determined to be speculative, not all tuples extracted from it are speculative. In this paper, we propose to study speculations in OIE tuples and determine whether a tuple is speculative. We formally define the research problem of tuple-level speculation detection. We then conduct a detailed analysis on the LSOIE dataset, which provides labels for speculative tuples. Lastly, we propose a baseline model, SpecTup, for this new research task.

Open Information Extraction via Chunks
Kuicai Dong | Aixin Sun | Jung-jae Kim | Xiaoli Li
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Open Information Extraction (OIE) aims to extract relational tuples from open-domain sentences. Existing OIE systems split a sentence into tokens and recognize token spans as tuple relations and arguments. We instead propose Sentence as Chunk sequence (SaC) and recognize chunk spans as tuple relations and arguments. We argue that SaC has better properties for OIE than sentence as token sequence, and evaluate four choices of chunks (i.e., CoNLL chunks, OIA simple phrases, noun phrases, and spans from SpanOIE). Also, we propose a simple end-to-end BERT-based model, Chunk-OIE, for sentence chunking and tuple extraction on top of SaC. Chunk-OIE achieves state-of-the-art results on multiple OIE datasets, showing that SaC benefits the OIE task.

2022

Syntactic Multi-view Learning for Open Information Extraction
Kuicai Dong | Aixin Sun | Jung-Jae Kim | Xiaoli Li
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Open Information Extraction (OpenIE) aims to extract relational tuples from open-domain sentences. Traditional rule-based or statistical models were built on the syntactic structure of a sentence, as identified by syntactic parsers. However, previous neural OpenIE models have under-explored this useful syntactic information. In this paper, we model both constituency and dependency trees as word-level graphs, and enable neural OpenIE to learn from the syntactic structures. To better fuse heterogeneous information from the two graphs, we adopt multi-view learning to capture multiple relationships from them. Finally, the finetuned constituency and dependency representations are aggregated with sentential semantic representations for tuple generation. Experiments show that both the constituency and dependency information and the multi-view learning are effective.

2021

DocOIE: A Document-level Context-Aware Dataset for OpenIE
Kuicai Dong | Zhao Yilin | Aixin Sun | Jung-Jae Kim | Xiaoli Li
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021