Beyond Boundaries: A Human-like Approach for Question Answering over Structured and Unstructured Information Sources

Jens Lehmann, Dhananjay Bhandiwad, Preetam Gattogi, Sahar Vahdati


Abstract
Answering factual questions from heterogeneous sources, such as graphs and text, is a key capacity of intelligent systems. Current approaches either (i) perform question answering over text and structured sources as separate pipelines followed by a merge step or (ii) provide an early integration, giving up the strengths of particular information sources. To solve this problem, we present “HumanIQ”, a method that teaches language models to dynamically combine retrieved information by imitating how humans use retrieval tools. Our approach couples a generic method for gathering human demonstrations of tool use with adaptive few-shot learning for tool-augmented models. We show that HumanIQ confers significant benefits, including (i) reducing the error rate of our strongest baseline (GPT-4) by over 50% across 3 benchmarks, (ii) improving human preference over responses from vanilla GPT-4 (45.3% wins, 46.7% ties, 8.0% losses), and (iii) outperforming numerous task-specific baselines.
Anthology ID:
2024.tacl-1.44
Volume:
Transactions of the Association for Computational Linguistics, Volume 12
Year:
2024
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
786–802
URL:
https://aclanthology.org/2024.tacl-1.44
DOI:
10.1162/tacl_a_00671
Cite (ACL):
Jens Lehmann, Dhananjay Bhandiwad, Preetam Gattogi, and Sahar Vahdati. 2024. Beyond Boundaries: A Human-like Approach for Question Answering over Structured and Unstructured Information Sources. Transactions of the Association for Computational Linguistics, 12:786–802.
Cite (Informal):
Beyond Boundaries: A Human-like Approach for Question Answering over Structured and Unstructured Information Sources (Lehmann et al., TACL 2024)
PDF:
https://aclanthology.org/2024.tacl-1.44.pdf