ResearchTrend.AI

Reinforcement Learning from LLM Feedback to Counteract Goal Misgeneralization

14 January 2024
Houda Nait El Barj
Théophile Sautory

Papers citing "Reinforcement Learning from LLM Feedback to Counteract Goal Misgeneralization"

4 / 4 papers shown
Puzzle Solving using Reasoning of Large Language Models: A Survey
Panagiotis Giadikiaroglou
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
ELM
ReLM
LRM
11
24
0
17 Feb 2024
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
273
1,561
0
18 Sep 2019
Constructing Unrestricted Adversarial Examples with Generative Models
Yang Song
Rui Shu
Nate Kushman
Stefano Ermon
GAN
AAML
166
300
0
21 May 2018
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
196
199
0
02 May 2018