2024
DBQR-QA: A Question Answering Dataset on a Hybrid of Database Querying and Reasoning
Rungsiman Nararatwong | Chung-Chi Chen | Natthawut Kertkeidkachorn | Hiroya Takamura | Ryutaro Ichise
Findings of the Association for Computational Linguistics: ACL 2024
This paper introduces the Database Querying and Reasoning Dataset for Question Answering (DBQR-QA), aimed at addressing the gap in current question-answering (QA) research by emphasizing the essential processes of database querying and reasoning to answer questions. Specifically designed to accommodate sequential questions and multi-hop queries, DBQR-QA more accurately mirrors the dynamics of real-world information retrieval and analysis, with a particular focus on the financial reports of US companies. The dataset’s construction, the challenges encountered during its development, the performance of large language models on this dataset, and a human evaluation are thoroughly discussed to illustrate the dataset’s complexity and highlight future research directions in querying and reasoning tasks.
2022
KIQA: Knowledge-Infused Question Answering Model for Financial Table-Text Data
Rungsiman Nararatwong | Natthawut Kertkeidkachorn | Ryutaro Ichise
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures
While entity retrieval models continue to advance their capabilities, our understanding of their wide-ranging applications is limited, especially in domain-specific settings. We highlighted this issue by using recent general-domain entity-linking models, LUKE and GENRE, to inject external knowledge into a question-answering (QA) model for a financial QA task with a hybrid tabular-textual dataset. We found that both models improved the baseline model by 1.57% overall and 8.86% on textual data. Nonetheless, the challenge remains as they still struggle to handle tabular inputs. We subsequently conducted a comprehensive attention-weight analysis, revealing how LUKE utilizes external knowledge supplied by GENRE. The analysis also elaborates how the injection of symbolic knowledge can be helpful and what needs further improvement, paving the way for future research on this challenging QA task and advancing our understanding of how a language model incorporates external knowledge.
Enhancing Financial Table and Text Question Answering with Tabular Graph and Numerical Reasoning
Rungsiman Nararatwong | Natthawut Kertkeidkachorn | Ryutaro Ichise
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Typical financial documents consist of tables, texts, and numbers. Given sufficient training data, large language models (LMs) can learn the tabular structures and perform numerical reasoning well in question answering (QA). However, their performance falls significantly when data and computational resources are limited. This study mitigates this performance drop by infusing explicit tabular structures through a graph neural network (GNN). We proposed a model developed from the baseline of a financial QA dataset named TAT-QA. The baseline model, TagOp, consists of answer span (evidence) extraction and numerical reasoning modules. As our main contributions, we introduced two components to the model: a GNN-based evidence extraction module for tables and an improved numerical reasoning module. The latter provides a solution to TagOp’s arithmetic calculation problem specific to operations requiring number ordering, such as subtraction and division, which account for a large portion of numerical reasoning. Our evaluation shows that the graph module has the advantage in low-resource settings, while the improved numerical reasoning significantly outperforms the baseline model.
iLab at FinCausal 2022: Enhancing Causality Detection with an External Cause-Effect Knowledge Graph
Ziwei Xu | Rungsiman Nararatwong | Natthawut Kertkeidkachorn | Ryutaro Ichise
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022
The application of span detection is growing fast along with the increasing need to understand the causes and effects of events, especially in the finance domain. However, when syntactic clues are absent from the text, the model tends to reverse the cause and effect spans. To solve this problem, we introduce graph construction techniques to inject cause-effect graph knowledge for graph embedding. The graph features are then combined with BERT embeddings and used to predict the cause-effect spans. The results show that our proposed graph builder method outperforms the other methods with or without external knowledge.
2019
Graph Pattern Entity Ranking Model for Knowledge Graph Completion
Takuma Ebisu | Ryutaro Ichise
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Knowledge graphs have developed rapidly in recent years and have proven useful for many artificial intelligence tasks. However, knowledge graphs often have many missing facts. To solve this problem, many knowledge graph embedding models have been developed to populate knowledge graphs, and they have shown outstanding performance. However, knowledge graph embedding models are so-called black boxes. Hence, we do not actually know how the information in a knowledge graph is processed, and the models are hard to interpret. In this paper, we utilize graph patterns in a knowledge graph to overcome such problems. Our proposed model, the graph pattern entity ranking model (GRank), constructs an entity ranking system for each graph pattern and evaluates them using a measure for a ranking system. By doing so, we can find helpful graph patterns for predicting facts. We then conduct link prediction tasks on standard datasets to evaluate GRank. We show that our approach outperforms other state-of-the-art approaches such as ComplEx and TorusE on standard metrics such as HITS@n and MRR. Moreover, this model is easily interpretable because output facts are described by graph patterns.
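For readers unfamiliar with the two link prediction metrics named in the abstract above, the following is a minimal sketch of how HITS@n and MRR are commonly computed from the rank of the correct entity for each test query. It is not taken from the GRank paper; the function names and the example ranks are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hits_at_n(ranks: Sequence[int], n: int) -> float:
    """Fraction of test queries whose correct entity is ranked in the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical 1-based ranks of the correct entity for four test triples.
example_ranks = [1, 3, 2, 10]

print("HITS@1:", hits_at_n(example_ranks, 1))
print("HITS@3:", hits_at_n(example_ranks, 3))
print("MRR:", mean_reciprocal_rank(example_ranks))
```

Both metrics lie in the interval (0, 1], and higher is better. MRR rewards placing the correct entity near the top more sharply than HITS@n, which only checks whether the rank falls under the cutoff.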