Bowen Xing

2025

Knowledge base question answering (KBQA) aims to answer natural language questions by reasoning over structured knowledge bases. Existing approaches often struggle with the complexity of mapping questions to precise logical forms, particularly when dealing with diverse entities and relations. In this paper, we propose Hierarchical Topology Multi-task Learning (HTML), a novel framework that leverages a hierarchical multi-task learning paradigm to enhance the performance of logical form generation. Our framework consists of a main task: generating logical forms from questions, and three auxiliary tasks: entity prediction from the input question, relation prediction for the given entities, and logical form generation based on the given entities and relations. Through joint instruction-tuning, HTML allows mutual guidance and knowledge transfer among the hierarchical tasks, capturing the subtle dependencies between entities, relations, and logical forms. Extensive experiments on public benchmarks show that HTML markedly outperforms both supervised fine-tuning methods and training-free ones based on powerful large language models (e.g., GPT-4), demonstrating its superiority in question understanding and structural knowledge reasoning.

2024

pdf bib abs
DC-Instruct: An Effective Framework for Generative Multi-intent Spoken Language Understanding
Bowen Xing | Lizi Liao | Minlie Huang | Ivor Tsang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

In the realm of multi-intent spoken language understanding, recent advancements have leveraged the potential of prompt learning frameworks. However, critical gaps exist in these frameworks: the lack of explicit modeling of dual-task dependencies and the oversight of task-specific semantic differences among utterances. To address these shortcomings, we propose DC-Instruct, a novel generative framework based on Dual-task Inter-dependent Instructions (DII) and Supervised Contrastive Instructions (SCI). Specifically, DII guides large language models (LLMs) to generate labels for one task based on the other task’s labels, thereby explicitly capturing dual-task inter-dependencies. Moreover, SCI leverages utterance semantics differences by guiding LLMs to determine whether a pair of utterances share the same or similar labels. This can improve LLMs on extracting and discriminating task-specific semantics, thus enhancing their SLU reasoning abilities. Extensive experiments on public benchmark datasets show that DC-Instruct markedly outperforms current generative models and state-of-the-art methods, demonstrating its effectiveness in enhancing dialogue language understanding and reasoning.

2022

pdf bib abs
Co-guiding Net: Achieving Mutual Guidances between Multiple Intent Detection and Slot Filling via Heterogeneous Semantics-Label Graphs
Bowen Xing | Ivor Tsang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Recent graph-based models for joint multiple intent detection and slot filling have obtained promising results through modeling the guidance from the prediction of intents to the decoding of slot filling.However, existing methods (1) only model the unidirectional guidance from intent to slot; (2) adopt homogeneous graphs to model the interactions between the slot semantics nodes and intent label nodes, which limit the performance.In this paper, we propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving the mutual guidances between the two tasks.In the first stage, the initial estimated labels of both tasks are produced, and then they are leveraged in the second stage to model the mutual guidances.Specifically, we propose two heterogeneous graph attention networks working on the proposed two heterogeneous semantics-label graphs, which effectively represent the relations among the semantics nodes and label nodes.Experiment results show that our model outperforms existing models by a large margin, obtaining a relative improvement of 19.3% over the previous best model on MixATIS dataset in overall accuracy.

pdf bib abs
Group is better than individual: Exploiting Label Topologies and Label Relations for Joint Multiple Intent Detection and Slot Filling
Bowen Xing | Ivor Tsang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Recent joint multiple intent detection and slot filling models employ label embeddings to achieve the semantics-label interactions.However, they treat all labels and label embeddings as uncorrelated individuals, ignoring the dependencies among them. Besides, they conduct the decoding for the two tasks independently, without leveraging the correlations between them.Therefore, in this paper, we first construct a Heterogeneous Label Graph (HLG) containing two kinds of topologies: (1) statistical dependencies based on labels’ co-occurrence patterns and hierarchies in slot labels; (2) rich relations among the label nodes.Then we propose a novel model termed ReLa-Net.It can capture beneficial correlations among the labels from HLG.The label correlations are leveraged to enhance semantic-label interactions. Moreover, we also propose the label-aware inter-dependent decoding mechanism to further exploit the label correlations for decoding. Experiment results show that our ReLa-Net significantly outperforms previous models.Remarkably, ReLa-Net surpasses the previous best model by over 20% in terms of overall accuracy on MixATIS dataset.

pdf bib abs
DARER: Dual-task Temporal Relational Recurrent Reasoning Network for Joint Dialog Sentiment Classification and Act Recognition
Bowen Xing | Ivor Tsang
Findings of the Association for Computational Linguistics: ACL 2022

The task of joint dialog sentiment classification (DSC) and act recognition (DAR) aims to simultaneously predict the sentiment label and act label for each utterance in a dialog. In this paper, we put forward a new framework which models the explicit dependencies via integrating prediction-level interactions other than semantics-level interactions, more consistent with human intuition.Besides, we propose a speaker-aware temporal graph (SATG) and a dual-task relational temporal graph (DRTG) to introduce temporal relations into dialog understanding and dual-task reasoning. To implement our framework, we propose a novel model dubbed DARER, which first generates the context-, speaker- and temporal-sensitive utterance representations via modeling SATG, then conducts recurrent dual-task relational reasoning on DRTG, in which process the estimated label distributions act as key clues in prediction-level interactions.Experiment results show that DARER outperforms existing models by large margins while requiring much less computation resource and costing less training time.Remarkably, on DSC task in Mastodon, DARER gains a relative improvement of about 25% over previous best model in terms of F1, with less than 50% parameters and about only 60% required GPU memory.

Co-authors

Venues

emnlp3
findings2

Fix author