Zheng Hui


2025

PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent
Jiateng Liu | Lin Ai | Zizhou Liu | Payam Karisani | Zheng Hui | Yi Fung | Preslav Nakov | Julia Hirschberg | Heng Ji
Proceedings of the 31st International Conference on Computational Linguistics

Propaganda plays a critical role in shaping public opinion and fueling disinformation. While existing research primarily focuses on identifying propaganda techniques, it lacks the ability to capture the broader motives and the impacts of such content. To address these challenges, we introduce PropaInsight, a conceptual framework grounded in foundational social science research, which systematically dissects propaganda into techniques, arousal appeals, and underlying intent. PropaInsight offers a more granular understanding of how propaganda operates across different contexts. Additionally, we present PropaGaze, a novel dataset that combines human-annotated data with high-quality synthetic data generated through a meticulously designed pipeline. Our experiments show that off-the-shelf LLMs struggle with propaganda analysis, but PropaGaze significantly improves performance. Fine-tuned Llama-7B-Chat achieves 203.4% higher text span IoU in technique identification and 66.2% higher BERTScore in appeal analysis compared to 1-shot GPT-4-Turbo. Moreover, PropaGaze complements limited human-annotated data in data-sparse and cross-domain scenarios, demonstrating its potential for comprehensive and generalizable propaganda analysis.

2024

Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension
Lin Ai | Zheng Hui | Zizhou Liu | Julia Hirschberg
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Defending Against Social Engineering Attacks in the Age of LLMs
Lin Ai | Tharindu Sandaruwan Kumarage | Amrita Bhattacharjee | Zizhou Liu | Zheng Hui | Michael S. Davinroy | James Cook | Laura Cassani | Kirill Trapeznikov | Matthias Kirchner | Arslan Basharat | Anthony Hoogs | Joshua Garland | Huan Liu | Julia Hirschberg
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui | Zhaoxiao Guo | Hang Zhao | Juanyong Duan | Congrui Huang
Findings of the Association for Computational Linguistics: EMNLP 2024

Across NLP tasks, detecting harmful content is crucial for online environments, especially given the growing influence of social media. However, previous research has two main issues: 1) a lack of data in low-resource settings, and 2) inconsistent definitions and criteria for judging harmful content, which require classification models to be robust to spurious features and to diverse criteria. We propose ToxiCraft, a novel framework for synthesizing datasets of harmful information to address these weaknesses. With only a small amount of seed data, our framework can generate a wide variety of synthetic, yet remarkably realistic, examples of toxic information. Experiments across various datasets show a notable improvement in detection model robustness and adaptability, surpassing or coming close to the gold labels.