ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.20850
  4. Cited By
Improving Reward Models with Synthetic Critiques

Improving Reward Models with Synthetic Critiques

31 May 2024
Zihuiwen Ye
Fraser Greenlee-Scott
Max Bartolo
Phil Blunsom
Jon Ander Campos
Matthias Gallé
    ALM
    SyDa
    LRM
ArXivPDFHTML

Papers citing "Improving Reward Models with Synthetic Critiques"

5 / 5 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Hanyang Zhao
Haoxian Chen
Yucheng Guo
Genta Indra Winata
Tingting Ou
Ziyu Huang
D. Yao
Wenpin Tang
54
0
0
13 Mar 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
L. Melo
Younesse Kaddar
Phil Blunsom
S. Kamath S
Yarin Gal
LRM
44
0
0
16 Feb 2025
TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and
  Generation
TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation
Jonathan Cook
Tim Rocktaschel
Jakob Foerster
Dennis Aumiller
Alex Wang
ALM
29
9
0
04 Oct 2024
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
1