OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems

Linyong Nan, Weining Fang, Aylin Rasteh, Pouya Lahabi, Weijin Zou, Yilun Zhao, Arman Cohan


Abstract
We introduce OMG-QA, a new question answering resource designed to evaluate systems that perform retrieval-augmented generation (RAG) in scenarios that demand reasoning over multi-modal, multi-document contexts. These systems, given a user query, must retrieve relevant contexts from the web, which may include non-textual information, and then reason over and synthesize this content to generate a detailed, coherent answer. Unlike existing open-domain QA datasets, OMG-QA requires systems to navigate and integrate diverse modalities and a broad pool of information sources, making it uniquely challenging. We conduct a thorough evaluation and analysis of a diverse set of QA systems, featuring various retrieval frameworks, document retrievers, document indexing approaches, evidence retrieval methods, and LLMs tasked with both information retrieval and generation. Our findings reveal significant limitations in existing approaches using RAG or LLM agents to address open questions that require long-form answers supported by multi-modal evidence. We believe that OMG-QA will be a valuable resource for developing QA systems that are better equipped to handle open-domain, multi-modal information-seeking tasks.
Anthology ID:
2024.emnlp-industry.75
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2024
Address:
Miami, Florida, US
Editors:
Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1001–1015
URL:
https://aclanthology.org/2024.emnlp-industry.75
DOI:
10.18653/v1/2024.emnlp-industry.75
Cite (ACL):
Linyong Nan, Weining Fang, Aylin Rasteh, Pouya Lahabi, Weijin Zou, Yilun Zhao, and Arman Cohan. 2024. OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1001–1015, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal):
OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems (Nan et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-industry.75.pdf