Haoran Meng


2024

pdf bib
DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade
Zefan Cai | Xin Zheng | Tianyu Liu | Haoran Meng | Jiaqi Han | Gang Yuan | Binghuai Lin | Baobao Chang | Yunbo Cao
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

In the constant updates of the product dialogue systems, we need to retrain the natural language understanding (NLU) model as new data from the real users would be merged into the existing data accumulated in the last updates. Within the newly added data, new intents would emerge and might have semantic entanglement with the existing intents, e.g. new intents that are semantically too specific or generic are actually a subset or superset of some existing intents in the semantic space, thus impairing the robustness of the NLU model.As the first attempt to solve this problem, we setup a new benchmark consisting of 4 Dialogue Version Control dataSets (DialogVCS). We formulate the intent detection with imperfect data in the system update as a multi-label classification task with positive but unlabeled intents, which asks the models to recognize all the proper intents, including the ones with semantic entanglement, in the inference.We also propose comprehensive baseline models and conduct in-depth analyses for the benchmark, showing that the semantically entangled intents can be effectively recognized with an automatic workflow. Our code and dataset are available at https://github.com/Zefan-Cai/DialogVCS.

2023

pdf bib
DialogQAE: N-to-N Question Answer Pair Extraction from Customer Service Chatlog
Xin Zheng | Tianyu Liu | Haoran Meng | Xu Wang | Yufan Jiang | Mengliang Rao | Binghuai Lin | Yunbo Cao | Zhifang Sui
Findings of the Association for Computational Linguistics: EMNLP 2023

Harvesting question-answer (QA) pairs from customer service chatlog in the wild is an efficient way to enrich the knowledge base for customer service chatbots in the cold start or continuous integration scenarios. Prior work attempts to obtain 1-to-1 QA pairs from growing customer service chatlog, which fails to integrate the incomplete utterances from the dialog context for composite QA retrieval. In this paper, we propose N-to-N QA extraction task in which the derived questions and corresponding answers might be separated across different utterances. We introduce a suite of generative/discriminative tagging based methods with end-to-end and two-stage variants that perform well on 5 customer service datasets and for the first time setup a benchmark for N-to-N DialogQAE with utterance and session level evaluation metrics. With a deep dive into extracted QA pairs, we find that the relations between and inside the QA pairs can be indicators to analyze the dialogue structure, e.g. information seeking, clarification, barge-in and elaboration. We also show that the proposed models can adapt to different domains and languages, and reduce the labor cost of knowledge accumulation in the real-world product dialogue platform.

2022

pdf bib
DialogUSR: Complex Dialogue Utterance Splitting and Reformulation for Multiple Intent Detection
Haoran Meng | Zheng Xin | Tianyu Liu | Zizhen Wang | He Feng | Binghuai Lin | Xuemin Zhao | Yunbo Cao | Zhifang Sui
Findings of the Association for Computational Linguistics: EMNLP 2022

While interacting with chatbots, users may elicit multiple intents in a single dialogue utterance. Instead of training a dedicated multi-intent detection model, we propose DialogUSR, a dialogue utterance splitting and reformulation task that first splits multi-intent user query into several single-intent sub-queries and then recovers all the coreferred and omitted information in the sub-queries. DialogUSR can serve as a plug-in and domain-agnostic module that empowers the multi-intent detection for the deployed chatbots with minimal efforts. We collect a high-quality naturally occurring dataset that covers 23 domains with a multi-step crowd-souring procedure. To benchmark the proposed dataset, we propose multiple action-based generative models that involve end-to-end and two-stage training, and conduct in-depth analyses on the pros and cons of the proposed baselines.

pdf bib
Learning Invariant Representation Improves Robustness for MRC Models
Yu Hai | Liang Wen | Haoran Meng | Tianyu Liu | Houfeng Wang
Findings of the Association for Computational Linguistics: EMNLP 2022

The prosperity of Pretrained Language Models(PLM) has greatly promoted the development of Machine Reading Comprehension (MRC). However, these models are vulnerable and not robust to adversarial examples. In this paper, we propose Stable and Contrastive Question Answering (SCQA) to improve invariance of representation to alleviate these robustness issues. Specifically, we first construct positive example pairs which have same answer through data augmentation. Then SCQA learns enhanced representations with better alignment between positive pairs by introducing stability and contrastive loss. Experimental results show that our approach can boost the robustness of QA models cross different MRC tasks and attack sets significantly and consistently.

pdf bib
Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues
Qingxiu Dong | Ziwei Qin | Heming Xia | Tian Feng | Shoujie Tong | Haoran Meng | Lin Xu | Zhongyu Wei | Weidong Zhan | Baobao Chang | Sujian Li | Tianyu Liu | Zhifang Sui
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

It is a common practice for recent works in vision language cross-modal reasoning to adopt a binary or multi-choice classification formulation taking as input a set of source image(s) and textual query. In this work, we take a sober look at such an “unconditional” formulation in the sense that no prior knowledge is specified with respect to the source image(s). Inspired by the designs of both visual commonsense reasoning and natural language inference tasks, we propose a new task termed “Premise-based Multi-modal Reasoning” (PMR) where a textual premise is the background presumption on each source image. The PMR dataset contains 15,360 manually annotated samples which are created by a multi-phase crowd-sourcing process. With selected high-quality movie screenshots and human-curated premise templates from 6 pre-defined categories, we ask crowd-source workers to write one true hypothesis and three distractors (4 choices) given the premise and image through a cross-check procedure.