ResearchTrend.AI

MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning

21 April 2025
Yahan Yang
Soham Dan
Shuo Li
Dan Roth
Insup Lee
Abstract

Large Language Models (LLMs) are susceptible to adversarial attacks such as jailbreaking, which can elicit harmful or unsafe behaviors. This vulnerability is exacerbated in multilingual settings, where multilingual safety-aligned data are often limited. Thus, developing a guardrail capable of detecting and filtering unsafe content across diverse languages is critical for deploying LLMs in real-world applications. In this work, we propose an approach to build a multilingual guardrail with reasoning. Our method consists of: (1) synthetic multilingual data generation incorporating culturally and linguistically nuanced variants, (2) supervised fine-tuning, and (3) a curriculum-guided Group Relative Policy Optimization (GRPO) framework that further improves performance. Experimental results demonstrate that our multilingual guardrail consistently outperforms recent baselines across both in-domain and out-of-domain languages. The multilingual reasoning capability of our guardrail enables it to generate multilingual explanations, which are particularly useful for understanding language-specific risks and ambiguities in multilingual content moderation.
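The GRPO step mentioned in the abstract trains the policy with group-relative advantages: several responses are sampled per prompt, and each response's reward is normalized against the mean and standard deviation of its group, so no separate value network is needed. The paper's exact implementation is not reproduced here; the following is a minimal sketch of that normalization step, with the reward values purely illustrative.

```python
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-8):
    """Compute GRPO-style advantages for one group of sampled responses.

    rewards: scalar rewards for G responses sampled from the same prompt
             (e.g., correctness of a safety classification).
    Each advantage is the reward's z-score within the group, so responses
    above the group mean are reinforced and those below are penalized.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for 4 sampled responses to one prompt:
advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

In a curriculum-guided setup, the same computation would simply be applied to prompt batches ordered from easier to harder examples.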

@article{yang2025_2504.15241,
  title={MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning},
  author={Yahan Yang and Soham Dan and Shuo Li and Dan Roth and Insup Lee},
  journal={arXiv preprint arXiv:2504.15241},
  year={2025}
}