PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback

3 August 2023
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Dinesh Manocha
Huazheng Wang
Mengdi Wang
Furong Huang
arXiv: 2308.02585

Papers citing "PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback"

5 / 5 papers shown
Learning to Steer Markovian Agents under Model Uncertainty
Jiawei Huang
Vinzenz Thoma
Zebang Shen
H. Nax
Niao He
26
2
0
14 Jul 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
33
2
0
30 May 2024
LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace
Bin Gao
Yan Yang
Ya-xiang Yuan
34
2
0
04 Apr 2024
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
28
17
0
28 Jan 2022
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
48
59
0
21 Jul 2020