IPS: In-Prompt Process Supervision for Short Video Content Moderation

Mingchao Liu; Yu Sun; Ruixiao Sun; Xin Dong; Xiang Shen; Hongwei Wang; Hongyu Xiong; Yang Song

Abstract

Multimodal large language models (MLLMs) are effective at capturing the semantics of short video content; however, they often fail to attend to the policy-specific details required for reliable content moderation. To address this limitation, we introduce IPS, a novel framework that integrates In-Prompt Process Supervision into MLLMs by introducing sequential reasoning over ancillary questions during fine-tuning. IPS consistently outperforms baseline MLLMs on public and proprietary benchmarks. Moreover, replacing human-annotated ancillary labels with MLLM-generated ones results in only marginal performance degradation, demonstrating robustness to noisy supervision and strong scalability with model-generated annotations. These findings establish IPS as a scalable and effective solution for complex multimodal classification in large-scale industrial settings.

Anthology ID: 2026.acl-industry.89
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1277–1288
URL: https://aclanthology.org/2026.acl-industry.89/
Cite (ACL): Mingchao Liu, Yu Sun, Ruixiao Sun, Xin Dong, Xiang Shen, Hongwei Wang, Hongyu Xiong, and Yang Song. 2026. IPS: In-Prompt Process Supervision for Short Video Content Moderation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 1277–1288, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): IPS: In-Prompt Process Supervision for Short Video Content Moderation (Liu et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-industry.89.pdf
