Erxin Yu


2024

pdf bib
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference
Erxin Yu | Jing Li | Ming Liao | Siqi Wang | Gao Zuchen | Fei Mi | Lanqing Hong
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

As large language models (LLMs) constantly evolve, ensuring their safety remains a critical research issue. Previous red teaming approaches for LLM safety have primarily focused on single prompt attacks or goal hijacking. To the best of our knowledge, we are the first to study LLM safety in multi-turn dialogue coreference. We created a dataset of 1,400 questions across 14 categories, each featuring multi-turn coreference safety attacks. We then conducted detailed evaluations on five widely used open-source LLMs. The results indicated that under multi-turn coreference safety attacks, the highest attack success rate was 56% with the LLaMA2-Chat-7b model, while the lowest was 13.9% with the Mistral-7B-Instruct model. These findings highlight the safety vulnerabilities in LLMs during dialogue coreference interactions.

pdf bib
RePALM: Popular Quote Tweet Generation via Auto-Response Augmentation
Erxin Yu | Jing Li | Chunpu Xu
Findings of the Association for Computational Linguistics: ACL 2024

A quote tweet enables users to share others’ content while adding their own commentary. In order to enhance public engagement through quote tweets, we investigate the task of generating popular quote tweets. This task aims to produce quote tweets that garner higher popularity, as indicated by increased likes, replies, and retweets. Despite the impressive language generation capabilities of large language models (LLMs), there has been limited research on how LLMs can effectively learn the popularity of text to better engage the public. Therefore, we introduce a novel approach called Response-augmented Popularity-Aligned Language Model (RePALM), which aligns language generation with popularity by leveraging insights from augmented auto-responses provided by readers. We utilize the Proximal Policy Optimization framework with a dual-reward mechanism to jointly optimize for the popularity of the quote tweet and its consistency with the auto-responses. In our experiments, we collected two datasets consisting of quote tweets containing external links and those referencing others’ tweets. Extensive results demonstrate the superiority of RePALM over advanced language models that do not incorporate response augmentation.

pdf bib
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction
Erxin Yu | Jing Li | Chunpu Xu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Social media platforms are daily exhibiting millions of events. To preliminarily predict the mainstream public reaction to these events, we study trendy response prediction to automatically generate top-liked user replies to social media events. While previous works focus on generating responses without factoring in popularity, we propose Popularity-Aligned Language Models (PopALM) to distinguish responses liked by a larger audience through reinforcement learning. Recognizing the noisy labels from user “likes”, we tailor-make curriculum learning in proximal policy optimization (PPO) to help models capture the essential samples for easy-to-hard training. In experiments, we build a large-scale Weibo dataset for trendy response prediction, and its results show that PopALM can help boost the performance of advanced language models.

2022

pdf bib
Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables
Erxin Yu | Lan Du | Yuan Jin | Zhepei Wei | Yi Chang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Recently, discrete latent variable models have received a surge of interest in both Natural Language Processing (NLP) and Computer Vision (CV), attributed to their comparable performance to the continuous counterparts in representation learning, while being more interpretable in their predictions. In this paper, we develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representation via vector quantization. Compared with previous models limited to local semantic contexts, our model can explore richer semantic information via topic modeling. We further boost the performance of semantic similarity by injecting the quantized representation into a transformer-based language model with a well-designed semantic-driven attention mechanism. We demonstrate, through extensive experiments across various English language datasets, that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.

pdf bib
Towards Unified Representations of Knowledge Graph and Expert Rules for Machine Learning and Reasoning
Zhepei Wei | Yue Wang | Jinnan Li | Zhining Liu | Erxin Yu | Yuan Tian | Xin Wang | Yi Chang
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

With a knowledge graph and a set of if-then rules, can we reason about the conclusions given a set of observations? In this work, we formalize this question as the cognitive inference problem, and introduce the Cognitive Knowledge Graph (CogKG) that unifies two representations of heterogeneous symbolic knowledge: expert rules and relational facts. We propose a general framework in which the unified knowledge representations can perform both learning and reasoning. Specifically, we implement the above framework in two settings, depending on the availability of labeled data. When no labeled data are available for training, the framework can directly utilize symbolic knowledge as the decision basis and perform reasoning. When labeled data become available, the framework casts symbolic knowledge as a trainable neural architecture and optimizes the connection weights among neurons through gradient descent. Empirical study on two clinical diagnosis benchmarks demonstrates the superiority of the proposed method over time-tested knowledge-driven and data-driven methods, showing the great potential of the proposed method in unifying heterogeneous symbolic knowledge, i.e., expert rules and relational facts, as the substrate of machine learning and reasoning models.

2020

pdf bib
ToHRE: A Top-Down Classification Strategy with Hierarchical Bag Representation for Distantly Supervised Relation Extraction
Erxin Yu | Wenjuan Han | Yuan Tian | Yi Chang
Proceedings of the 28th International Conference on Computational Linguistics

Distantly Supervised Relation Extraction (DSRE) has proven to be effective to find relational facts from texts, but it still suffers from two main problems: the wrong labeling problem and the long-tail problem. Most of the existing approaches address these two problems through flat classification, which lacks hierarchical information of relations. To leverage the informative relation hierarchies, we formulate DSRE as a hierarchical classification task and propose a novel hierarchical classification framework, which extracts the relation in a top-down manner. Specifically, in our proposed framework, 1) we use a hierarchically-refined representation method to achieve hierarchy-specific representation; 2) a top-down classification strategy is introduced instead of training a set of local classifiers. The experiments on NYT dataset demonstrate that our approach significantly outperforms other state-of-the-art approaches, especially for the long-tail problem.