A multi-hop question answering (QA) dataset aims to test reasoning and inference skills by requiring a model to read multiple paragraphs to answer a given question. However, current datasets do not provide a complete explanation for the reasoning process from the question to the answer. Further, previous studies revealed that many examples in existing multi-hop datasets do not require multi-hop reasoning to answer a question. In this study, we present a new multi-hop QA dataset, called 2WikiMultiHopQA, which uses structured and unstructured data. In our dataset, we introduce the evidence information containing a reasoning path for multi-hop questions. The evidence information has two benefits: (i) providing a comprehensive explanation for predictions and (ii) evaluating the reasoning skills of a model. We carefully design a pipeline and a set of templates when generating a question-answer pair that guarantees the multi-hop steps and the quality of the questions. We also exploit the structured format in Wikidata and use logical rules to create questions that are natural but still require multi-hop reasoning. Through experiments, we demonstrate that our dataset is challenging for multi-hop models and it ensures that multi-hop reasoning is required.
We present a neural exhaustive approach that addresses named entity recognition (NER) and relation recognition (RE), for the entity and re- lation recognition over the wet-lab protocols shared task. We introduce BERT-based neural exhaustive approach that enumerates all pos- sible spans as potential entity mentions and classifies them into entity types or no entity with deep neural networks to address NER. To solve relation extraction task, based on the NER predictions or given gold mentions we create all possible trigger-argument pairs and classify them into relation types or no relation. In NER task, we achieved 76.60% in terms of F-score as third rank system among the partic- ipated systems. In relation extraction task, we achieved 80.46% in terms of F-score as the top system in the relation extraction or recognition task. Besides we compare our model based on the wet lab protocols corpus (WLPC) with the WLPC baseline and dynamic graph-based in- formation extraction (DyGIE) systems.
This paper describes our system developed for the coreference resolution task of the CRAFT Shared Tasks 2019. The CRAFT corpus is more challenging than other existing corpora because it contains full text articles. We have employed an existing span-based state-of-theart neural coreference resolution system as a baseline system. We enhance the system with two different techniques to capture longdistance coreferent pairs. Firstly, we filter noisy mentions based on parse trees with increasing the number of antecedent candidates. Secondly, instead of relying on the LSTMs, we integrate the highly expressive language model–BERT into our model. Experimental results show that our proposed systems significantly outperform the baseline. The best performing system obtained F-scores of 44%, 48%, 39%, 49%, 40%, and 57% on the test set with B3, BLANC, CEAFE, CEAFM, LEA, and MUC metrics, respectively. Additionally, the proposed model is able to detect coreferent pairs in long distances, even with a distance of more than 200 sentences.