arXiv: 2308.02585
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
3 August 2023
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Dinesh Manocha
Huazheng Wang
Mengdi Wang
Furong Huang
Papers citing "PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback"
5 / 5 papers shown
Title
Learning to Steer Markovian Agents under Model Uncertainty
Jiawei Huang
Vinzenz Thoma
Zebang Shen
H. Nax
Niao He
26
2
0
14 Jul 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
33
2
0
30 May 2024
LancBiO: dynamic Lanczos-aided bilevel optimization via Krylov subspace
Bin Gao
Yan Yang
Ya-xiang Yuan
34
2
0
04 Apr 2024
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
28
17
0
28 Jan 2022
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
48
59
0
21 Jul 2020