ResearchTrend.AI

RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization

2 October 2025
Zhaoning Yu
Will Su
Leitian Tao
Haozhu Wang
Aashu Singh
Hanchao Yu
Jianyu Wang
Hongyang Gao
Weizhe Yuan
Jason Weston
Ping Yu
Jing Xu
    OffRL, LRM

Papers citing "RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization"

0 / 0 papers shown
No papers found