Recent years have witnessed growing interests in incorporating external knowledge such as pre-trained word embeddings (PWEs) or pre-trained language models (PLMs) into neural topic modeling. However, we found that employing PWEs and PLMs for topic modeling only achieved limited performance improvements but with huge computational overhead. In this paper, we propose a novel strategy to incorporate external knowledge into neural topic modeling where the neural topic model is pre-trained on a large corpus and then fine-tuned on the target dataset. Experiments have been conducted on three datasets and results show that the proposed approach significantly outperforms both current state-of-the-art neural topic models and some topic modeling approaches enhanced with PWEs or PLMs. Moreover, further study shows that the proposed approach greatly reduces the need for the huge size of training data.
Context-dependent text-to-SQL is the task of translating multi-turn questions into database-related SQL queries. Existing methods typically focus on making full use of history context or previously predicted SQL for currently SQL parsing, while neglecting to explicitly comprehend the schema and conversational dependency, such as co-reference, ellipsis and user focus change. In this paper, we propose CQR-SQL, which uses auxiliary Conversational Question Reformulation (CQR) learning to explicitly exploit schema and decouple contextual dependency for multi-turn SQL parsing. Specifically, we first present a schema enhanced recursive CQR method to produce domain-relevant self-contained questions. Secondly, we train CQR-SQL models to map the semantics of multi-turn questions and auxiliary self-contained questions into the same latent space through schema grounding consistency task and tree-structured SQL parsing consistency task, which enhances the abilities of SQL parsing by adequately contextual understanding. At the time of writing, our CQR-SQL achieves new state-of-the-art results on two context-dependent text-to-SQL benchmarks SParC and CoSQL.
Relation detection in knowledge base question answering, aims to identify the path(s) of relations starting from the topic entity node that is linked to the answer node in knowledge graph. Such path might consist of multiple relations, which we call multi-hop. Moreover, for a single question, there may exist multiple relation paths to the correct answer, which we call multi-label. However, most of existing approaches only detect one single path to obtain the answer without considering other correct paths, which might affect the final performance. Therefore, in this paper, we propose a novel divide-and-conquer approach for multi-label multi-hop relation detection (DC-MLMH) by decomposing it into head relation detection and conditional relation path generation. In specific, a novel path sampling mechanism is proposed to generate diverse relation paths for the inference stage. A majority-vote policy is employed to detect final KB answer. Comprehensive experiments were conducted on the FreebaseQA benchmark dataset. Experimental results show that the proposed approach not only outperforms other competitive multi-label baselines, but also has superiority over some state-of-art KBQA methods.