Dangyang Chen

2025

CoMIF: Modeling of Complex Multiple Interaction Factors for Conversation Generation
Yuxuan Chen | Wei Wei | Shixuan Fan | Kaihe Xu | Dangyang Chen
Proceedings of the 31st International Conference on Computational Linguistics

Highly realistic human-machine interaction is challenging for open-domain dialogue systems. Although existing methods have achieved notable progress by leveraging various interaction factors (e.g., emotion, personality, topic) for delivering human-like (e.g., empathetic, personalized and semantically-consistent) responses, they typically model each factor alone and thus easily suffer from low-quality response generation. We attribute this limitation to the neglect of implicit correlations among factors. Furthermore, different factors may alternately dominate token-level response generation during decoding, making it harder to generate high-quality responses by applying various factors at the sentence level. To address this issue, we present a unified response generation framework, which is capable of simultaneously modeling Complex Multiple Interaction Factors (named CoMIF) to generate human-like conversations. To model the implicit correlations among factors, CoMIF first employs a dynamic perception module to construct a directed collaborative graph to jointly learn the dynamics over time of each factor, as well as the cross-dependencies among them. Additionally, we design a scalable post-adaptation module to introduce token-level factor signals to generate more human-like responses with appropriate multiple factors. Extensive experiments over multiple datasets demonstrate that the proposed method achieves superior performance in generating more human-like responses with appropriate multiple factors, as compared to state-of-the-art methods.

SA-DETR: Span Aware Detection Transformer for Moment Retrieval
Tianheng Xiong | Wei Wei | Kaihe Xu | Dangyang Chen
Proceedings of the 31st International Conference on Computational Linguistics

Moment Retrieval aims to locate specific video segments related to the given text. Recently, DETR-based methods, originating from Object Detection, have emerged as effective solutions for Moment Retrieval. These approaches focus on multimodal feature fusion and refining Queries composed of a span anchor and a content embedding. Despite their success, they often overlook the video-text instance related information in Query Initialization and the crucial guidance role of span anchors in Query Refinement, leading to inaccurate predictions. To address this, we propose a novel Span Aware DEtection TRansformer (SA-DETR) that leverages the importance of instance related span anchors. To fully leverage the instance related information, we generate span anchors based on the video-text pair rather than using learnable parameters, as is common in conventional DETR-based methods, and supervise them with GT labels. To effectively exploit the correspondence between span anchors and video clips, we enhance the content embedding guided by textual features and generate a Gaussian mask to modulate the interaction between the content embedding and fusion features. Furthermore, we explore the feature alignment across various stages and granularities and apply denoise learning to boost the span awareness of the model. Extensive experiments on QVHighlights, Charades-STA, and TACoS demonstrate the effectiveness of our approach.

2024

Confidence is not Timeless: Modeling Temporal Validity for Rule-based Temporal Knowledge Graph Forecasting
Rikui Huang | Wei Wei | Xiaoye Qu | Shengzhe Zhang | Dangyang Chen | Yu Cheng
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, Temporal Knowledge Graph Forecasting (TKGF) has emerged as a pivotal domain for forecasting future events. Unlike black-box neural network methods, rule-based approaches are lauded for their efficiency and interpretability. For this line of work, it is crucial to correctly estimate the predictive effectiveness of the rules, i.e., the confidence. However, the existing literature lacks in-depth investigation into how confidence evolves with time. Moreover, inaccurate and heuristic confidence estimation limits the performance of rule-based methods. To alleviate such issues, we propose a framework named TempValid to explicitly model the temporal validity of rules for TKGF. Specifically, we design a time function to model the interaction between temporal information and confidence. TempValid conceptualizes confidence and other coefficients as learnable parameters to avoid inaccurate estimation and combinatorial explosion. Furthermore, we introduce rule-adversarial negative sampling and time-aware negative sampling strategies to facilitate TempValid learning. Extensive experiments show that TempValid significantly outperforms previous state-of-the-art (SOTA) rule-based methods on six TKGF datasets. Moreover, it exhibits substantial advancements in cross-domain and resource-constrained rule learning scenarios.

To meet the requirements of real-world applications, it is essential to control generations of large language models (LLMs). Prior research has tried to introduce reinforcement learning (RL) into controllable text generation while most existing methods suffer from overfitting issues (finetuning-based methods) or semantic collapse (post-processing methods). However, current RL methods are generally guided by coarse-grained (sentence/paragraph-level) feedback, which may lead to suboptimal performance owing to semantic twists or progressions within sentences. To tackle that, we propose a novel reinforcement learning algorithm named TOLE which formulates TOken-LEvel rewards for controllable text generation, and employs a “first-quantize-then-noise” paradigm to enhance the robustness of the RL algorithm. Furthermore, TOLE can be flexibly extended to multiple constraints with little computational expense. Experimental results show that our algorithm can achieve superior performance on both single-attribute and multi-attribute control tasks. We have released our codes at https://github.com/WindyLee0822/CTG.

Recently, the topic-grounded dialogue (TGD) system has become increasingly popular owing to its powerful capability to actively guide users to accomplish specific tasks through topic-guided conversations. Most existing works utilize side information (e.g., topics or personas) in isolation to enhance the topic selection ability. However, because they disregard the noise within these auxiliary information sources and their mutual influence, current models tend to predict user-uninteresting and contextually irrelevant topics. To build a user-engaging and coherent dialogue agent, we propose a personalized topic selection model for topic-grounded dialogue, named PETD, which takes account of the interaction of side information to selectively aggregate such information for more accurately predicting subsequent topics. Specifically, we evaluate the correlation between global topics and personas and selectively incorporate the global topics aligned with user personas. Furthermore, we propose a contrastive learning based persona selector to filter relevant personas under the constraint of lacking pertinent persona annotations. Throughout the selection and generation, diverse relevant side information is considered. Extensive experiments demonstrate that our proposed method can generate engaging and diverse responses, outperforming state-of-the-art baselines across various evaluation metrics.

Text classification is a crucial task encountered frequently in practical scenarios, yet it is still under-explored in the era of large language models (LLMs). This study shows that LLMs are vulnerable to changes in the number and arrangement of options in text classification. Our extensive empirical analyses reveal that the key bottleneck arises from ambiguous decision boundaries and inherent biases towards specific tokens and positions. To mitigate these issues, we make the first attempt and propose a novel two-stage classification framework for LLMs. Our approach is grounded in the empirical observation that pairwise comparisons can effectively alleviate boundary ambiguity and inherent bias. Specifically, we begin with a self-reduction technique to efficiently narrow down numerous options, which contributes to reduced decision space and a faster comparison process. Subsequently, pairwise contrastive comparisons are employed in a chain-of-thought manner to draw out nuances and distinguish confusable options, thus refining the ambiguous decision boundary. Extensive experiments on four datasets (Banking77, HWU64, LIU54, and Clinic150) verify the effectiveness of our framework. Furthermore, benefitting from our framework, various LLMs can achieve consistent improvements. Our code and data are available in https://github.com/Chuge0335/PC-CoT.

Modeling Historical Relevant and Local Frequency Context for Representation-Based Temporal Knowledge Graph Forecasting
Shengzhe Zhang | Wei Wei | Rikui Huang | Wenfeng Xie | Dangyang Chen
Findings of the Association for Computational Linguistics: EMNLP 2024

CNEQ: Incorporating numbers into Knowledge Graph Reasoning
Xianshu Peng | Wei Wei | Kaihe Xu | Dangyang Chen
Findings of the Association for Computational Linguistics: EMNLP 2024

Complex logical reasoning over knowledge graphs lies at the heart of many semantic downstream applications and thus has been extensively explored in recent years. However, nearly all existing methods overlook the rich semantics of numerical entities (e.g., magnitude, unit, and distribution), simply treating them as common entities or even removing them directly. This may severely hinder performance when answering queries involving numerical comparison or numerical computation. To address this issue, we propose the Complex Number and Entity Query model (CNEQ), which comprises a Number-Entity Predictor and an Entity Filter. The Number-Entity Predictor can independently learn the structural and semantic features of entities and numerical values, thereby enabling better prediction of entities as well as numerical values. The Entity Filter can compare or calculate numerical values to filter out entities that meet certain numerical constraints. To evaluate our model, we generated a variety of multi-hop complex logical queries including numerical values on three widely-used Knowledge Graphs: FB15K, DB15K, and YAGO15K. Experimental results demonstrate that CNEQ achieves state-of-the-art results.

2023

Conversational recommender systems (CRS) aim to timely trace the dynamic interests of users through dialogues and generate relevant responses for item recommendations. Recently, various external knowledge bases (especially knowledge graphs) are incorporated into CRS to enhance the understanding of conversation contexts. However, recent reasoning-based models heavily rely on simplified structures such as linear structures or fixed-hierarchical structures for causality reasoning, hence they cannot fully figure out sophisticated relationships among utterances with external knowledge. To address this, we propose a novel Tree structure Reasoning schEmA named TREA. TREA constructs a multi-hierarchical scalable tree as the reasoning structure to clarify the causal relationships between mentioned entities, and fully utilizes historical conversations to generate more reasonable and suitable responses for recommended results. Extensive experiments on two public CRS datasets have demonstrated the effectiveness of our approach.

Miracle: Towards Personalized Dialogue Generation with Latent-Space Multiple Personal Attribute Control
Zhenyi Lu | Wei Wei | Xiaoye Qu | Xian-Ling Mao | Dangyang Chen | Jixiong Chen
Findings of the Association for Computational Linguistics: EMNLP 2023

Personalized dialogue systems aim to endow the chatbot agent with more anthropomorphic traits for human-like interactions. Previous approaches have explored explicit user profile modeling using text descriptions, implicit derivation of user embeddings, or utilizing handicraft prompts for ChatGPT-like models. However, textual personas are limited in describing multi-faceted attributes (e.g., language style, inner character nuances), implicit embedding suffers from personality sparsity, and handicraft prompts lack fine-grained and stable controllability. Hence, these approaches may struggle with complex personalized dialogue generation tasks that require generating controllable responses with multiple personal attributes. To this end, we propose Miracle, a novel personalized dialogue generation method through MultIple PeRsonal Attributes Control within Latent-Space Energy-based Models. Specifically, our approach first disentangles complex personality into multi-faceted attributes. Subsequently, we employ a conditional variational auto-encoder to align with the dense personalized responses within a latent joint attribute space. We have also tailored a dedicated energy function and customized the ordinary differential equations sampling method to offer flexible attribute composition and precise attribute control. Extensive experiments demonstrate that Miracle outperforms several strong baselines in terms of personality controllability and response generation quality. Our dataset and code are available at https://github.com/LZY-the-boys/MIRACLE

2022

HCL-TAT: A Hybrid Contrastive Learning Method for Few-shot Event Detection with Task-Adaptive Threshold
Ruihan Zhang | Wei Wei | Xian-Ling Mao | Rui Fang | Dangyang Chen
Findings of the Association for Computational Linguistics: EMNLP 2022

Event detection has been suffering from constantly emerging event types with a lack of sufficient data. Existing works formulate the new problem as few-shot event detection (FSED), and employ two-stage or unified models based on meta-learning to address the problem. However, these methods fall far short of expectations due to: (i) insufficient learning of discriminative representations in low-resource scenarios, and (ii) representation overlap between triggers and non-triggers. To resolve the above issues, in this paper, we propose a novel Hybrid Contrastive Learning method with a Task-Adaptive Threshold (abbreviated as HCL-TAT), which enables discriminative representation learning with a two-view contrastive loss (support-support and prototype-query), and devises an easily-adapted threshold to alleviate misidentification of triggers. Extensive experiments on the benchmark dataset FewEvent demonstrate that our method achieves better results than state-of-the-art methods. All the data and codes will be available to facilitate future research.