GuardReasoner-Omni: A Reasoning-based Multi-modal Guardrail for Text, Image, and Video

Zhenhao Zhu
Yue Liu
Yanpei Guo
Wenjie Qu
Cancan Chen
Yufei He
Yibo Li
Yulin Chen
Tianyi Wu
Huiying Xu
Xinzhong Zhu
Jiaheng Zhang
Main: 8 Pages
10 Figures
Bibliography: 5 Pages
5 Tables
Appendix: 6 Pages
Abstract

We present GuardReasoner-Omni, a reasoning-based guardrail model that moderates text, image, and video data. We first construct a training corpus of 148k samples spanning these three modalities. The training pipeline then follows a two-stage paradigm that encourages the model to deliberate before making decisions: (1) supervised fine-tuning (SFT) to cold-start the model with explicit reasoning capabilities and structural adherence; and (2) reinforcement learning (RL) with an error-driven exploration reward that incentivizes deeper reasoning on hard samples. We release models at 2B and 4B parameters. Extensive experiments show that GuardReasoner-Omni outperforms existing state-of-the-art baselines across various guardrail benchmarks. Notably, GuardReasoner-Omni (2B) surpasses the runner-up by 5.3% in F1 score.
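
For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall on the positive class. The following is a minimal sketch of that computation, not the paper's evaluation code; the label strings "harmful" and "unharmful" and the toy predictions are assumptions chosen for illustration.

```python
def f1_score(y_true, y_pred, positive="harmful"):
    """F1 for one positive class; the label names are illustrative assumptions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 2 true positives, 1 false positive, 1 false negative.
labels = ["harmful", "unharmful", "harmful", "harmful", "unharmful"]
preds = ["harmful", "harmful", "unharmful", "harmful", "unharmful"]
score = f1_score(labels, preds)
print(f"F1 = {score:.3f}")
```

Because F1 balances missed harmful content (recall) against false alarms (precision), it is a common summary metric for binary moderation tasks with imbalanced classes.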
