Hey AI, Can You Solve Complex Tasks by Talking to Agents?

Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal


Abstract
Training giant models from scratch for each complex task is resource- and data-inefficient. To help develop models that can leverage existing systems, we propose a new challenge: learning to solve complex tasks by communicating with existing agents (or models) in natural language. We design a synthetic benchmark, CommaQA, with three complex reasoning tasks (explicit, implicit, numeric) designed to be solved by communicating with existing QA agents. For instance, text and table QA agents can be used together to answer questions such as “Who had the longest javelin throw from USA?” We show that black-box models struggle to learn this task from scratch (accuracy under 50%) even with access to each agent’s knowledge and gold facts supervision. In contrast, models that learn to communicate with agents outperform black-box models, reaching scores of 100% when given gold decomposition supervision. However, we show that learning to solve complex tasks by communicating with existing agents, without relying on any auxiliary supervision or data, remains highly elusive. We will release CommaQA, along with a compositional generalization test split, to advance research in this direction.
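To make the idea of solving a question by communicating with agents concrete, here is a minimal sketch of how the javelin example might be decomposed into sub-questions for a text agent and a table agent. Everything in it is an assumption made for illustration: the agent functions, the question templates, and the toy data are hypothetical and are not taken from CommaQA or the allenai/commaqa code.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge, invented for this sketch (not CommaQA data).
NATIONALITY_TEXT: Dict[str, str] = {"Alice": "USA", "Bob": "USA", "Carol": "Kenya"}
THROWS_TABLE: List[Tuple[str, float]] = [
    ("Alice", 61.2), ("Bob", 64.8), ("Carol", 70.1), ("Alice", 63.0),
]

TEXT_PREFIX = "Who is from "
TABLE_PREFIX = "What is the longest javelin throw by "


def text_agent(question: str) -> List[str]:
    """Toy text QA agent: only understands 'Who is from <country>?'."""
    if not question.startswith(TEXT_PREFIX):
        raise ValueError(f"text agent cannot answer: {question}")
    country = question[len(TEXT_PREFIX):].rstrip("?")
    return sorted(n for n, c in NATIONALITY_TEXT.items() if c == country)


def table_agent(question: str) -> float:
    """Toy table QA agent: only understands the longest-throw template."""
    if not question.startswith(TABLE_PREFIX):
        raise ValueError(f"table agent cannot answer: {question}")
    name = question[len(TABLE_PREFIX):].rstrip("?")
    return max(dist for n, dist in THROWS_TABLE if n == name)


def solve(country: str) -> str:
    """Answer 'Who had the longest javelin throw from <country>?'
    by asking each agent a simpler natural-language question."""
    athletes = text_agent(f"{TEXT_PREFIX}{country}?")
    best = {a: table_agent(f"{TABLE_PREFIX}{a}?") for a in athletes}
    return max(best, key=best.get)


answer = solve("USA")
```

In this sketch the decomposition is hard-coded in `solve`; the challenge described in the abstract is for a model to learn such decompositions itself, ideally without gold decomposition supervision.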
Anthology ID:
2022.findings-acl.142
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venues:
ACL | Findings
Publisher:
Association for Computational Linguistics
Pages:
1808–1823
URL:
https://aclanthology.org/2022.findings-acl.142
DOI:
10.18653/v1/2022.findings-acl.142
Cite (ACL):
Tushar Khot, Kyle Richardson, Daniel Khashabi, and Ashish Sabharwal. 2022. Hey AI, Can You Solve Complex Tasks by Talking to Agents?. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1808–1823, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Hey AI, Can You Solve Complex Tasks by Talking to Agents? (Khot et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.142.pdf
Code:
allenai/commaqa
Data:
DROP, MathQA