Wanqing Cui
2024
MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning
Wanqing Cui | Keping Bi | Jiafeng Guo | Xueqi Cheng
Findings of the Association for Computational Linguistics: ACL 2024
LINKAGE: Listwise Ranking among Varied-Quality References for Non-Factoid QA Evaluation via LLMs
Sihui Yang | Keping Bi | Wanqing Cui | Jiafeng Guo | Xueqi Cheng
Findings of the Association for Computational Linguistics: EMNLP 2024
Non-Factoid (NF) Question Answering (QA) is challenging to evaluate because potential answers are diverse and there is no objective criterion. Commonly used automatic evaluation metrics such as ROUGE or BERTScore cannot accurately measure semantic similarity or handle answers from different perspectives. Recently, Large Language Models (LLMs) have been adopted for NFQA evaluation due to their compelling performance on various NLP tasks. Common approaches include pointwise scoring of each candidate answer and pairwise comparisons between answers. Inspired by the evolution from pointwise to pairwise to listwise in learning-to-rank methods, we propose a novel listwise NFQA evaluation approach that utilizes LLMs to rank candidate answers in a list of reference answers sorted by descending quality. Moreover, for NF questions that do not have multi-grade or any golden answers, we leverage LLMs to generate a reference answer list of varying quality to facilitate the listwise evaluation. Extensive experimental results on three NFQA datasets, i.e., ANTIQUE, TREC-DL-NF, and WebGLM, show that our method has significantly higher correlations with human annotations than automatic scores and common pointwise and pairwise approaches.
2020
Beyond Language: Learning Commonsense from Images for Reasoning
Wanqing Cui | Yanyan Lan | Liang Pang | Jiafeng Guo | Xueqi Cheng
Findings of the Association for Computational Linguistics: EMNLP 2020
This paper proposes a novel approach to learn commonsense from images, instead of limited raw texts or costly constructed knowledge bases, for the commonsense reasoning problem in NLP. Our motivation comes from the fact that an image is worth a thousand words, where richer scene information could be leveraged to help distill the commonsense knowledge, which is often hidden in language. Our approach, namely Loire, consists of two stages. In the first stage, a bi-modal sequence-to-sequence approach is utilized to conduct the scene layout generation task, based on a text representation model ViBERT. In this way, the required visual scene knowledge, such as spatial relations, will be encoded in ViBERT by the supervised learning process with bi-modal data such as COCO. Then ViBERT is concatenated with a pre-trained language model to perform the downstream commonsense reasoning tasks. Experimental results on two commonsense reasoning problems, i.e., commonsense question answering and pronoun resolution, demonstrate that Loire outperforms traditional language-based methods. We also give some case studies to show what knowledge is learned from images and explain how the generated scene layout helps the commonsense reasoning process.
Co-authors
- Xueqi Cheng 3
- Jiafeng Guo (嘉丰 郭) 3
- Keping Bi 2
- Yanyan Lan 1
- Liang Pang 1