2024
Towards Faithful Knowledge Graph Explanation Through Deep Alignment in Commonsense Question Answering
Weihe Zhai | Arkaitz Zubiaga | Bingquan Liu | Chengjie Sun | Yalong Zhao
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The fusion of language models (LMs) and knowledge graphs (KGs) is widely used in commonsense question answering, but generating faithful explanations remains challenging. Current methods often overlook path decoding faithfulness, leading to divergence between graph encoder outputs and model predictions. We identify confounding effects and LM-KG misalignment as key factors causing spurious explanations. To address this, we introduce the LM-KG Fidelity metric to assess KG representation reliability and propose the LM-KG Distribution-aware Alignment (LKDA) algorithm to improve explanation faithfulness. Without ground truth, we evaluate KG explanations using the proposed Fidelity-Sparsity Trade-off Curve. Experiments on CommonsenseQA and OpenBookQA show that LKDA significantly enhances explanation fidelity and model performance, highlighting the need to address distributional misalignment for reliable commonsense reasoning.
2023
融合文本困惑度特征和相似度特征的推特机器人检测方法 (Twitter robot detection method based on text perplexity feature and similarity feature)
Zhongjie Wang (王钟杰) | Zhaowen Zhang (张朝文) | Wenqi Ding (丁文琪) | Yumeng Fu (付雨濛) | Lili Shan (单丽莉) | Bingquan Liu (刘秉权)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
The goal of the Twitter bot detection task is to determine whether a Twitter account belongs to a real person or is an automated bot. With the rapid iteration of human-mimicking algorithms for automated accounts, detecting the newest generations of automated accounts has become increasingly difficult. Recently, pre-trained language models have shown outstanding performance on natural language generation and other tasks; when these models are used to automatically generate tweets, they pose a major challenge for Twitter bot detection. This paper finds that abnormally low perplexity and abnormally high similarity consistently appear in the historical tweets of automated accounts across different eras, and that this phenomenon is unaffected by the choice of pre-trained language model. Based on these findings, we propose a method for extracting perplexity and similarity features from historical tweets, and design a feature-fusion strategy to better incorporate these new features into existing models. Our method outperforms existing baselines on the selected datasets, and won first place in the social bot detection competition hosted by People's Daily Online and organized by the State Key Laboratory of Communication Content Cognition.
2022
Pre-training Language Models with Deterministic Factual Knowledge
Shaobo Li | Xiaoguang Li | Lifeng Shang | Chengjie Sun | Bingquan Liu | Zhenzhou Ji | Xin Jiang | Qun Liu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Previous works show that Pre-trained Language Models (PLMs) can capture factual knowledge. However, some analyses reveal that PLMs fail to do so robustly, e.g., they are sensitive to changes in prompts when extracting factual knowledge. To mitigate this issue, we propose to let PLMs learn the deterministic relationship between the remaining context and the masked content. The deterministic relationship ensures that the masked factual content can be deterministically inferred from the existing clues in the context. This provides more stable patterns for PLMs to capture factual knowledge than random masking. Two pre-training tasks are further introduced to motivate PLMs to rely on the deterministic relationship when filling masks. Specifically, we use an external Knowledge Base (KB) to identify deterministic relationships and continuously pre-train PLMs with the proposed methods. The factual knowledge probing experiments indicate that the continuously pre-trained PLMs achieve better robustness in capturing factual knowledge. Further experiments on question-answering datasets show that learning a deterministic relationship with the proposed methods can also help other knowledge-intensive tasks.
How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis
Shaobo Li | Xiaoguang Li | Lifeng Shang | Zhenhua Dong | Chengjie Sun | Bingquan Liu | Zhenzhou Ji | Xin Jiang | Qun Liu
Findings of the Association for Computational Linguistics: ACL 2022
Recently, there has been a trend to investigate the factual knowledge captured by Pre-trained Language Models (PLMs). Many works show PLMs' ability to fill in the missing factual words in cloze-style prompts such as "Dante was born in [MASK]." However, it is still a mystery how PLMs generate the results correctly: do they rely on effective clues or on shortcut patterns? We try to answer this question with a causal-inspired analysis that quantitatively measures and evaluates the word-level patterns that PLMs depend on to generate the missing words. We check the words that have three typical associations with the missing words: knowledge-dependent, positionally close, and highly co-occurring. Our analysis shows that: (1) PLMs generate the missing factual words more through the positionally close and highly co-occurring words than the knowledge-dependent words; (2) the dependence on the knowledge-dependent words is more effective than on the positionally close and highly co-occurring words. Accordingly, we conclude that PLMs capture factual knowledge ineffectively because they depend on inadequate associations.
HIT&QMUL at SemEval-2022 Task 9: Label-Enclosed Generative Question Answering (LEG-QA)
Weihe Zhai | Mingqiang Feng | Arkaitz Zubiaga | Bingquan Liu
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
This paper presents the second-place system for the R2VQ competence-based multimodal question answering shared task. The purpose of this task is to involve semantic and cooking roles as well as text-image objects when querying how well a system understands the procedure of a recipe. We approach this task with a text-to-text generative model based on the transformer architecture. As a result, the model generalises well to soft-constrained and other competence-based question answering problems. We propose a label-enclosed input method which helps the model achieve a significant improvement from 65.34 (baseline) to 91.3. In addition to describing the submitted system, we investigate the impact of model architecture and label selection, along with remarks from error analysis. Finally, future work is discussed.
2021
Knowledge-Interactive Network with Sentiment Polarity Intensity-Aware Multi-Task Learning for Emotion Recognition in Conversations
Yunhe Xie | Kailai Yang | Chengjie Sun | Bingquan Liu | Zhenzhou Ji
Findings of the Association for Computational Linguistics: EMNLP 2021
Emotion Recognition in Conversation (ERC) has gained much attention from the NLP community recently. Some models concentrate on leveraging commonsense knowledge or multi-task learning to help complicated emotional reasoning. However, these models neglect direct utterance-knowledge interaction. In addition, they utilize emotion-indirect auxiliary tasks, which provide limited affective information for the ERC task. To address the above issues, we propose a Knowledge-Interactive Network with sentiment polarity intensity-aware multi-task learning, namely KI-Net, which leverages both commonsense knowledge and a sentiment lexicon to augment semantic information. Specifically, we use a self-matching module for internal utterance-knowledge interaction. Considering correlations with the ERC task, a phrase-level Sentiment Polarity Intensity Prediction (SPIP) task is devised as an auxiliary task. Experiments show that the knowledge-integration, self-matching, and SPIP modules each improve model performance on three datasets. Moreover, our KI-Net model shows a 1.04% performance improvement over the state-of-the-art model on the IEMOCAP dataset.
2020
CN-HIT-IT.NLP at SemEval-2020 Task 4: Enhanced Language Representation with Multiple Knowledge Triples
Yice Zhang | Jiaxuan Lin | Yang Fan | Peng Jin | Yuanchao Liu | Bingquan Liu
Proceedings of the Fourteenth Workshop on Semantic Evaluation
This paper describes our system for SemEval-2020 Task 4: Commonsense Validation and Explanation. For this task, external knowledge, such as a knowledge graph, can clearly help the model understand commonsense in natural language statements. But how to select the right triples for a statement remains unsolved, so reducing the interference of irrelevant triples on model performance is a research focus. This paper adopts a modified K-BERT as the language encoder to enhance language representation with triples from knowledge graphs. Experiments show that our method outperforms models without external knowledge and is slightly better than the original K-BERT. We achieved an accuracy score of 0.97 on subtask A, ranking 1/45, and an accuracy score of 0.948, ranking 2/35.
2019
Neural-based Chinese Idiom Recommendation for Enhancing Elegance in Essay Writing
Yuanchao Liu | Bo Pang | Bingquan Liu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Although the proper use of idioms can enhance the elegance of writing, the active use of various expressions is a challenge because remembering idioms is difficult. In this study, we address the problem of idiom recommendation by leveraging a neural machine translation framework, in which we suppose that idioms are written with one pseudo target language. Two types of real-life datasets are collected to support this study. Experimental results show that the proposed approach achieves promising performance compared with other baseline methods.
2018
LSDSCC: a Large Scale Domain-Specific Conversational Corpus for Response Generation with Diversity Oriented Evaluation Metrics
Zhen Xu | Nan Jiang | Bingquan Liu | Wenge Rong | Bowen Wu | Baoxun Wang | Zhuoran Wang | Xiaolong Wang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
It has been proven that automatic conversational agents can be built up using the End-to-End Neural Response Generation (NRG) framework, and such a data-driven methodology requires a large number of dialog pairs for model training and reasonable evaluation metrics for testing. This paper proposes a Large Scale Domain-Specific Conversational Corpus (LSDSCC) composed of high-quality query-response pairs extracted from a domain-specific online forum, with thorough preprocessing and cleansing procedures. Also, a testing set, including multiple diverse responses annotated for each query, is constructed, and on this basis, the metrics for measuring the diversity of generated results are further presented. We evaluate the performances of neural dialog models with the widely applied diversity boosting strategies on the proposed dataset. The experimental results have shown that our proposed corpus can be taken as a new benchmark dataset for the NRG task, and the presented metrics are promising to guide the optimization of NRG models by quantifying the diversity of the generated responses reasonably.
ITNLP-ARC at SemEval-2018 Task 12: Argument Reasoning Comprehension with Attention
Wenjie Liu | Chengjie Sun | Lei Lin | Bingquan Liu
Proceedings of the 12th International Workshop on Semantic Evaluation
Reasoning is a very important topic with many applications in the field of natural language processing. Semantic Evaluation (SemEval) 2018 Task 12, "The Argument Reasoning Comprehension Task", was devoted to research on natural language reasoning. For this task, we propose a novel argument reasoning comprehension system, ITNLP-ARC, which uses neural network techniques to solve the problem. In our system, an LSTM model encodes both the premise sentences and the warrant sentences, and an attention model merges the two premise sentence vectors. By comparing the similarity between the attention vector and each of the two warrant vectors, we choose the one with higher similarity as our system's final answer.
2017
ITNLP-AiKF at SemEval-2017 Task 1: Rich Features Based SVR for Semantic Textual Similarity Computing
Wenjie Liu | Chengjie Sun | Lei Lin | Bingquan Liu
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Semantic Textual Similarity (STS) is devoted to measuring the degree of semantic equivalence between sentence pairs. We propose a new system, ITNLP-AiKF, which we applied to SemEval-2017 Task 1, Semantic Textual Similarity, Track 5 (English monolingual pairs). Our system draws on rich features, including ontology-based, word-embedding-based, corpus-based, alignment-based, and literal-based features. We leverage these features to predict sentence-pair similarity with a Support Vector Regression (SVR) model. Our system achieved a Pearson correlation of 0.8231, a competitive result in the contest for this track.
Neural Response Generation via GAN with an Approximate Embedding Layer
Zhen Xu | Bingquan Liu | Baoxun Wang | Chengjie Sun | Xiaolong Wang | Zhuoran Wang | Chao Qi
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
This paper presents a Generative Adversarial Network (GAN) to model single-turn short-text conversations, which trains a sequence-to-sequence (Seq2Seq) network for response generation simultaneously with a discriminative classifier that measures the differences between human-produced responses and machine-generated ones. In addition, the proposed method introduces an approximate embedding layer to solve the non-differentiable problem caused by the sampling-based output decoding procedure in the Seq2Seq generative model. The GAN setup provides an effective way to avoid non-informative responses (a.k.a. "safe responses"), which are frequently observed in traditional neural response generators. The experimental results show that the proposed approach significantly outperforms existing neural response generation models in diversity metrics, with slight increases in relevance scores as well, when evaluated on both a Mandarin corpus and an English corpus.
2014
WINGS: Writing with Intelligent Guidance and Suggestions
Xianjun Dai | Yuanchao Liu | Xiaolong Wang | Bingquan Liu
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations
2013
Multimodal DBN for Predicting High-Quality Answers in cQA portals
Haifeng Hu | Bingquan Liu | Baoxun Wang | Ming Liu | Xiaolong Wang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
2012
Generating Questions from Web Community Contents
Baoxun Wang | Bingquan Liu | Chengjie Sun | Xiaolong Wang | Deyuan Zhang
Proceedings of COLING 2012: Demonstration Papers
2010
Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities
Baoxun Wang | Xiaolong Wang | Chengjie Sun | Bingquan Liu | Lin Sun
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Learning to Detect Hedges and their Scope Using CRF
Qi Zhao | Chengjie Sun | Bingquan Liu | Yong Cheng
Proceedings of the Fourteenth Conference on Computational Natural Language Learning – Shared Task
CRF tagging for head recognition based on Stanford parser
Yong Cheng | Chengjie Sun | Bingquan Liu | Lei Lin
CIPS-SIGHAN Joint Conference on Chinese Language Processing
2007
An Empirical Study of Non-Stationary Ngram Model and its Smoothing Techniques
Jinghui Xiao | Bingquan Liu | Xiaolong Wang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 12, Number 2, June 2007
Exploiting Pinyin Constraints in Pinyin-to-Character Conversion Task: a Class-Based Maximum Entropy Markov Model Approach
Jinghui Xiao | Bingquan Liu | Xiaolong Wang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 12, Number 3, September 2007: Special Issue on Invited Papers from ISCSLP 2006
2005
Principles of Non-stationary Hidden Markov Model and Its Applications to Sequence Labeling Task
JingHui Xiao | BingQuan Liu | XiaoLong Wang
Second International Joint Conference on Natural Language Processing: Full Papers