Language Models Can Learn from Verbal Feedback Without Scalar Rewards

26 September 2025
Renjie Luo
Zichen Liu
Xiangyan Liu
Chao Du
Min Lin
Wenhu Chen
Wei Lu
Tianyu Pang
    OffRL
ArXiv (abs), PDF, HTML, HuggingFace (62 upvotes), GitHub (2045★)

Papers citing "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences
Batu El
J. Zou
07 Oct 2025