Yunxiao Zhao


2025

LOG: A Local-to-Global Optimization Approach for Retrieval-based Explainable Multi-Hop Question Answering
Hao Xu | Yunxiao Zhao | Jiayang Zhang | Zhiqiang Wang | Ru Li
Proceedings of the 31st International Conference on Computational Linguistics

Multi-hop question answering (MHQA) aims to derive an answer from multiple retrieved, information-dense documents. However, modeling the importance of the retrieved knowledge is very challenging. Previous approaches primarily emphasize single-step or multi-step iterative decomposition and retrieval, which are susceptible to failure in long-chain reasoning due to the progressive accumulation of erroneous information. To address this problem, we propose a novel Local-tO-Global optimized retrieval method (LOG) to discover more beneficial information, facilitating MHQA. In particular, we design a local information modeling scheme based on pointwise conditional v-information to cover usable documents that contain reasoning knowledge. We also improve the tuplet objective loss, advancing multi-example-aware global optimization to model the relationships between scattered documents. Extensive experimental results demonstrate that our proposed method outperforms prior state-of-the-art models and significantly improves multi-hop reasoning, notably long-chain reasoning.

2024

AGR: Reinforced Causal Agent-Guided Self-explaining Rationalization
Yunxiao Zhao | Zhiqiang Wang | Xiaoli Li | Jiye Liang | Ru Li
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Most existing rationalization approaches are susceptible to degeneration accumulation because they lack effective control over the model's learning direction during training. To address this issue, we propose AGR (Agent-Guided Rationalization), a novel approach that guides the model's next action based on its current training state. Specifically, we introduce causal intervention calculus to quantify the causal effects inherent in rationale training, and use a reinforcement learning process to refine their learning bias. Furthermore, we pretrain an agent within this reinforced causal environment to guide the model's next step. We theoretically demonstrate that a good model needs the desired guidance, and empirically show the effectiveness of our approach, which outperforms existing state-of-the-art methods on the BeerAdvocate and HotelReview datasets.

2021

GCRC: A New Challenging MRC Dataset from Gaokao Chinese for Explainable Evaluation
Hongye Tan | Xiaoyue Wang | Yu Ji | Ru Li | Xiaoli Li | Zhiwei Hu | Yunxiao Zhao | Xiaoqi Han
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021