Hao He


2021

pdf bib
End-to-End Conversational Search for Online Shopping with Utterance Transfer
Liqiang Xiao | Jun Ma | Xin Luna Dong | Pascual Martínez-Gómez | Nasser Zalmout | Wei Chen | Tong Zhao | Hao He | Yaohui Jin
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Successful conversational search systems can present natural, adaptive and interactive shopping experience for online shopping customers. However, building such systems from scratch faces real word challenges from both imperfect product schema/knowledge and lack of training dialog data. In this work we first propose ConvSearch, an end-to-end conversational search system that deeply combines the dialog system with search. It leverages the text profile to retrieve products, which is more robust against imperfect product schema/knowledge compared with using product attributes alone. We then address the lack of data challenges by proposing an utterance transfer approach that generates dialogue utterances by using existing dialog from other domains, and leveraging the search behavior data from e-commerce retailer. With utterance transfer, we introduce a new conversational search dataset for online shopping. Experiments show that our utterance transfer method can significantly improve the availability of training dialogue data without crowd-sourcing, and the conversational search system significantly outperformed the best tested baseline.

pdf bib
Diagnosing the First-Order Logical Reasoning Ability Through LogicNLI
Jidong Tian | Yitian Li | Wenqing Chen | Liqiang Xiao | Hao He | Yaohui Jin
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently, language models (LMs) have achieved significant performance on many NLU tasks, which has spurred widespread interest for their possible applications in the scientific and social area. However, LMs have faced much criticism of whether they are truly capable of reasoning in NLU. In this work, we propose a diagnostic method for first-order logic (FOL) reasoning with a new proposed benchmark, LogicNLI. LogicNLI is an NLI-style dataset that effectively disentangles the target FOL reasoning from commonsense inference and can be used to diagnose LMs from four perspectives: accuracy, robustness, generalization, and interpretability. Experiments on BERT, RoBERTa, and XLNet, have uncovered the weaknesses of these LMs on FOL reasoning, which motivates future exploration to enhance the reasoning ability.

pdf bib
De-Confounded Variational Encoder-Decoder for Logical Table-to-Text Generation
Wenqing Chen | Jidong Tian | Yitian Li | Hao He | Yaohui Jin
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Logical table-to-text generation aims to automatically generate fluent and logically faithful text from tables. The task remains challenging where deep learning models often generated linguistically fluent but logically inconsistent text. The underlying reason may be that deep learning models often capture surface-level spurious correlations rather than the causal relationships between the table x and the sentence y. Specifically, in the training stage, a model can get a low empirical loss without understanding x and use spurious statistical cues instead. In this paper, we propose a de-confounded variational encoder-decoder (DCVED) based on causal intervention, learning the objective p(y|do(x)). Firstly, we propose to use variational inference to estimate the confounders in the latent space and cooperate with the causal intervention based on Pearl’s do-calculus to alleviate the spurious correlations. Secondly, to make the latent confounder meaningful, we propose a back-prediction process to predict the not-used entities but linguistically similar to the exactly selected ones. Finally, since our variational model can generate multiple candidates, we train a table-text selector to find out the best candidate sentence for the given table. An extensive set of experiments show that our model outperforms the baselines and achieves new state-of-the-art performance on two logical table-to-text datasets in terms of logical fidelity.

2020

pdf bib
A Semantically Consistent and Syntactically Variational Encoder-Decoder Framework for Paraphrase Generation
Wenqing Chen | Jidong Tian | Liqiang Xiao | Hao He | Yaohui Jin
Proceedings of the 28th International Conference on Computational Linguistics

Paraphrase generation aims to generate semantically consistent sentences with different syntactic realizations. Most of the recent studies rely on the typical encoder-decoder framework where the generation process is deterministic. However, in practice, the ability to generate multiple syntactically different paraphrases is important. Recent work proposed to cooperate variational inference on a target-related latent variable to introduce the diversity. But the latent variable may be contaminated by the semantic information of other unrelated sentences, and in turn, change the conveyed meaning of generated paraphrases. In this paper, we propose a semantically consistent and syntactically variational encoder-decoder framework, which uses adversarial learning to ensure the syntactic latent variable be semantic-free. Moreover, we adopt another discriminator to improve the word-level and sentence-level semantic consistency. So the proposed framework can generate multiple semantically consistent and syntactically different paraphrases. The experiments show that our model outperforms the baseline models on the metrics based on both n-gram matching and semantic similarity, and our model can generate multiple different paraphrases by assembling different syntactic variables.

pdf bib
Exploring Logically Dependent Multi-task Learning with Causal Inference
Wenqing Chen | Jidong Tian | Liqiang Xiao | Hao He | Yaohui Jin
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Previous studies have shown that hierarchical multi-task learning (MTL) can utilize task dependencies by stacking encoders and outperform democratic MTL. However, stacking encoders only considers the dependencies of feature representations and ignores the label dependencies in logically dependent tasks. Furthermore, how to properly utilize the labels remains an issue due to the cascading errors between tasks. In this paper, we view logically dependent MTL from the perspective of causal inference and suggest a mediation assumption instead of the confounding assumption in conventional MTL models. We propose a model including two key mechanisms: label transfer (LT) for each task to utilize the labels of all its lower-level tasks, and Gumbel sampling (GS) to deal with cascading errors. In the field of causal inference, GS in our model is essentially a counterfactual reasoning process, trying to estimate the causal effect between tasks and utilize it to improve MTL. We conduct experiments on two English datasets and one Chinese dataset. Experiment results show that our model achieves state-of-the-art on six out of seven subtasks and improves predictions’ consistency.

pdf bib
Modeling Content Importance for Summarization with Pre-trained Language Models
Liqiang Xiao | Lu Wang | Hao He | Yaohui Jin
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Modeling content importance is an essential yet challenging task for summarization. Previous work is mostly based on statistical methods that estimate word-level salience, which does not consider semantics and larger context when quantifying importance. It is thus hard for these methods to generalize to semantic units of longer text spans. In this work, we apply information theory on top of pre-trained language models and define the concept of importance from the perspective of information amount. It considers both the semantics and context when evaluating the importance of each semantic unit. With the help of pre-trained language models, it can easily generalize to different kinds of semantic units n-grams or sentences. Experiments on CNN/Daily Mail and New York Times datasets demonstrate that our method can better model the importance of content than prior work based on F1 and ROUGE scores.