%0 Conference Proceedings
%T SemEval-2022 Task 9: R2VQ – Competence-based Multimodal Question Answering
%A Tu, Jingxuan
%A Holderness, Eben
%A Maru, Marco
%A Conia, Simone
%A Rim, Kyeongmin
%A Lynch, Kelley
%A Brutti, Richard
%A Navigli, Roberto
%A Pustejovsky, James
%Y Emerson, Guy
%Y Schluter, Natalie
%Y Stanovsky, Gabriel
%Y Kumar, Ritesh
%Y Palmer, Alexis
%Y Schneider, Nathan
%Y Singh, Siddharth
%Y Ratan, Shyam
%S Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
%D 2022
%8 July
%I Association for Computational Linguistics
%C Seattle, United States
%F tu-etal-2022-semeval
%X In this task, we identify a challenge that is reflective of linguistic and cognitive competencies that humans have when speaking and reasoning. Particularly, given the intuition that textual and visual information mutually inform each other for semantic reasoning, we formulate a Competence-based Question Answering challenge, designed to involve rich semantic annotation and aligned text-video objects. The task is to answer questions from a collection of cooking recipes and videos, where each question belongs to a “question family” reflecting a specific reasoning competence. The data and task results are publicly available.
%R 10.18653/v1/2022.semeval-1.176
%U https://aclanthology.org/2022.semeval-1.176
%U https://doi.org/10.18653/v1/2022.semeval-1.176
%P 1244-1255