Yitian Li


2022

pdf bib
To What Extent Do Natural Language Understanding Datasets Correlate to Logical Reasoning? A Method for Diagnosing Logical Reasoning.
Yitian Li | Jidong Tian | Wenqing Chen | Caoyun Fan | Hao He | Yaohui Jin
Proceedings of the 29th International Conference on Computational Linguistics

Reasoning and knowledge-related skills are considered as two fundamental skills for natural language understanding (NLU) tasks such as machine reading comprehension (MRC) and natural language inference (NLI). However, it is not clear to what extent an NLU task defined on a dataset correlates to a specific NLU skill. On the one hand, evaluating the correlation requires an understanding of the significance of the NLU skill in a dataset. Significance judges whether a dataset includes sufficient material to help the model master this skill. On the other hand, it is also necessary to evaluate the dependence of the task on the NLU skill. Dependence is a measure of how much the task defined on a dataset depends on the skill. In this paper, we propose a systematic method to diagnose the correlations between an NLU dataset and a specific skill, and then take a fundamental reasoning skill, logical reasoning, as an example for analysis. The method adopts a qualitative indicator to indicate the significance while adopting a quantitative indicator to measure the dependence. We perform diagnosis on 8 MRC datasets (including two types) and 3 NLI datasets and acquire intuitively reasonable results. We then perform the analysis to further understand the results and the proposed indicators. Based on the analysis, although the diagnostic method has some limitations, it is still an effective method to perform a basic diagnosis of the correlation between the dataset and logical reasoning skill, which also can be generalized to other NLU skills.

2021

pdf bib
Diagnosing the First-Order Logical Reasoning Ability Through LogicNLI
Jidong Tian | Yitian Li | Wenqing Chen | Liqiang Xiao | Hao He | Yaohui Jin
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently, language models (LMs) have achieved significant performance on many NLU tasks, which has spurred widespread interest for their possible applications in the scientific and social area. However, LMs have faced much criticism of whether they are truly capable of reasoning in NLU. In this work, we propose a diagnostic method for first-order logic (FOL) reasoning with a new proposed benchmark, LogicNLI. LogicNLI is an NLI-style dataset that effectively disentangles the target FOL reasoning from commonsense inference and can be used to diagnose LMs from four perspectives: accuracy, robustness, generalization, and interpretability. Experiments on BERT, RoBERTa, and XLNet, have uncovered the weaknesses of these LMs on FOL reasoning, which motivates future exploration to enhance the reasoning ability.

pdf bib
De-Confounded Variational Encoder-Decoder for Logical Table-to-Text Generation
Wenqing Chen | Jidong Tian | Yitian Li | Hao He | Yaohui Jin
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Logical table-to-text generation aims to automatically generate fluent and logically faithful text from tables. The task remains challenging where deep learning models often generated linguistically fluent but logically inconsistent text. The underlying reason may be that deep learning models often capture surface-level spurious correlations rather than the causal relationships between the table x and the sentence y. Specifically, in the training stage, a model can get a low empirical loss without understanding x and use spurious statistical cues instead. In this paper, we propose a de-confounded variational encoder-decoder (DCVED) based on causal intervention, learning the objective p(y|do(x)). Firstly, we propose to use variational inference to estimate the confounders in the latent space and cooperate with the causal intervention based on Pearl’s do-calculus to alleviate the spurious correlations. Secondly, to make the latent confounder meaningful, we propose a back-prediction process to predict the not-used entities but linguistically similar to the exactly selected ones. Finally, since our variational model can generate multiple candidates, we train a table-text selector to find out the best candidate sentence for the given table. An extensive set of experiments show that our model outperforms the baselines and achieves new state-of-the-art performance on two logical table-to-text datasets in terms of logical fidelity.