Zhenfang Zhu


2024

HyperMR: Hyperbolic Hypergraph Multi-hop Reasoning for Knowledge-based Visual Question Answering
Bin Wang | Fuyong Xu | Peiyu Liu | Zhenfang Zhu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Knowledge-based Visual Question Answering (KBVQA) is a challenging task that aims to answer an image-related question using external knowledge. Most existing work measures semantic distance as the Euclidean distance between two nodes, which distorts the modeling of knowledge graphs with hierarchical and scale-free structure in KBVQA and limits the multi-hop reasoning capability of the model. In contrast, hyperbolic space shows exciting prospects for low-distortion embedding of graphs with hierarchical and scale-free structure. In addition, we map the different stages of reasoning into multiple adjustable hyperbolic spaces, achieving low-distortion, fine-grained reasoning. Extensive experiments on the KVQA, PQ and PQL datasets demonstrate the effectiveness of HyperMR for strong-hierarchy knowledge graphs.
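
To make the contrast between Euclidean and hyperbolic distance concrete, here is a minimal sketch that assumes the Poincaré ball model with an adjustable curvature parameter `c`. The abstract does not say which hyperbolic model HyperMR uses or how its spaces are adjusted, so the function name `poincare_distance` and the parameter `c` are illustrative assumptions, not the authors' implementation.

```python
import math


def poincare_distance(u, v, c=1.0):
    """Geodesic distance between u and v in a Poincare ball of curvature -c.

    Assumption: the Poincare ball model is used here for illustration only;
    the abstract does not specify the model used by HyperMR.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    denom = (1.0 - c * sq_u) * (1.0 - c * sq_v)
    if 1.0 - c * sq_u <= 0.0 or 1.0 - c * sq_v <= 0.0:
        raise ValueError("points must lie strictly inside the ball of radius 1/sqrt(c)")
    # arcosh form of the distance, rescaled by 1/sqrt(c) for curvature -c
    return math.acosh(1.0 + 2.0 * c * sq_diff / denom) / math.sqrt(c)


def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two points near the boundary (think: leaves of a hierarchy) are much
# farther apart in hyperbolic terms than their Euclidean distance suggests.
leaf_a, leaf_b = (0.9, 0.0), (0.0, 0.9)
hyperbolic = poincare_distance(leaf_a, leaf_b)
euclidean = euclidean_distance(leaf_a, leaf_b)
```

In this sketch, distances grow rapidly toward the boundary of the ball, which is the property that lets tree-like graphs embed with low distortion. Varying `c` changes how strongly the space curves, one plausible reading of "adjustable" in the abstract.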