Congrui Huang


2025

STAND-Guard: A Small Task-Adaptive Content Moderation Model
Minjia Wang | Pingping Lin | Siqi Cai | Shengnan An | Shengjie Ma | Zeqi Lin | Congrui Huang | Bixiong Xu
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track

Content moderation, the process of reviewing and monitoring the safety of generated content, is important for the development of welcoming online platforms and responsible large language models. Content moderation comprises various tasks, each with unique requirements tailored to specific scenarios. It is therefore crucial to develop a model that can be accurately adapted to novel or customized content moderation tasks without extensive model tuning. This paper presents STAND-Guard, a Small Task-Adaptive coNtent moDeration model. The basic motivation is that instruction tuning on a variety of content moderation tasks can unleash the power of small language models (SLMs) on unseen (out-of-distribution) content moderation tasks. We also carefully study the effects of training tasks and model size on the efficacy of the cross-task fine-tuning mechanism. Experiments demonstrate that STAND-Guard is comparable to GPT-3.5-Turbo across over 40 public datasets, as well as on proprietary datasets derived from real-world business scenarios. Remarkably, STAND-Guard achieves nearly equivalent results to GPT-4-Turbo on unseen English binary classification tasks.
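To make the cross-task setup concrete, here is a minimal sketch (not the authors' code) of how heterogeneous moderation tasks can be cast into a single instruction format before supervised fine-tuning of an SLM; all task names, guideline texts, and the prompt template are hypothetical illustrations.

```python
# A minimal sketch of unifying diverse content moderation tasks into one
# instruction format for cross-task instruction tuning. Everything here
# (template wording, task names, labels) is a hypothetical assumption,
# not the STAND-Guard implementation.

from dataclasses import dataclass

@dataclass
class ModerationExample:
    task: str        # e.g. "hate-speech", "spam" (hypothetical task names)
    guideline: str   # task-specific policy text
    text: str        # content to review
    label: str       # gold verdict, e.g. "safe" / "unsafe"

PROMPT_TEMPLATE = (
    "You are a content moderator for the task: {task}.\n"
    "Policy: {guideline}\n"
    "Content: {text}\n"
    "Verdict (safe/unsafe):"
)

def to_instruction_pair(ex: ModerationExample) -> dict:
    """Render one example as a (prompt, response) pair for SFT."""
    return {
        "prompt": PROMPT_TEMPLATE.format(
            task=ex.task, guideline=ex.guideline, text=ex.text
        ),
        "response": ex.label,
    }

if __name__ == "__main__":
    examples = [
        ModerationExample("hate-speech",
                          "Flag content attacking protected groups.",
                          "...", "unsafe"),
        ModerationExample("spam",
                          "Flag unsolicited promotional content.",
                          "Great discussion, thanks for sharing!", "safe"),
    ]
    # Mixing pairs from many tasks into one SFT corpus is what lets the
    # tuned SLM generalize to unseen moderation tasks at inference time.
    for pair in (to_instruction_pair(ex) for ex in examples):
        print(pair["prompt"], pair["response"], sep="\n", end="\n\n")
```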

2024

ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui | Zhaoxiao Guo | Hang Zhao | Juanyong Duan | Congrui Huang
Findings of the Association for Computational Linguistics: EMNLP 2024

Across many NLP tasks, detecting harmful content is crucial for online environments, especially given the growing influence of social media. However, previous research faces two main issues: 1) a lack of data in low-resource settings, and 2) inconsistent definitions and criteria for judging harmful content, which require classification models to be robust to spurious features and to diverse formulations. We propose ToxiCraft, a novel framework for synthesizing datasets of harmful information that addresses these weaknesses. With only a small amount of seed data, our framework can generate a wide variety of synthetic yet remarkably realistic examples of toxic information. Experiments across various datasets show a notable enhancement in detection model robustness and adaptability, with performance surpassing or approaching that achieved with gold labels.
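The seed-driven generation loop the abstract describes could look roughly like the following sketch (not the ToxiCraft implementation): a handful of labeled seed examples are packed into a few-shot prompt, and an LLM, abstracted here as a plain `generate` callable, is asked to produce new examples in the same style. The function names, prompt wording, and mock model are all assumptions for illustration.

```python
# A minimal sketch of expanding a small seed set into a synthetic corpus
# via few-shot prompting. Not the ToxiCraft code; the prompt format and
# helper names are hypothetical.

import random
from typing import Callable, List, Tuple

Seed = Tuple[str, str]  # (text, label), e.g. ("...", "harmful")

def build_prompt(seeds: List[Seed], k: int = 3) -> str:
    """Sample k seeds and format them as few-shot demonstrations."""
    shots = random.sample(seeds, k=min(k, len(seeds)))
    demo = "\n".join(f"[{label}] {text}" for text, label in shots)
    return (
        "Below are labeled examples of online content.\n"
        f"{demo}\n"
        "Write one new, realistic example in the same style, "
        "prefixed with its label in brackets:"
    )

def synthesize(seeds: List[Seed],
               generate: Callable[[str], str],  # any LLM completion fn
               n: int = 100) -> List[str]:
    """Expand a small seed set into n synthetic labeled examples."""
    return [generate(build_prompt(seeds)) for _ in range(n)]

if __name__ == "__main__":
    seeds = [("buy followers now!!!", "harmful"),
             ("lovely weather today", "benign"),
             ("click this link to win $$$", "harmful")]
    # Stand-in for a real model call, so the sketch runs offline.
    mock_llm = lambda prompt: "[harmful] limited offer, click fast!!!"
    print(build_prompt(seeds))
    print(synthesize(seeds, mock_llm, n=2))
```

Resampling the few-shot demonstrations on every call is the simple device that gives the synthetic corpus its variety despite the small seed pool.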

2011

Timeline Generation through Evolutionary Trans-Temporal Summarization
Rui Yan | Liang Kong | Congrui Huang | Xiaojun Wan | Xiaoming Li | Yan Zhang
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing