arXiv:2403.11062
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
17 March 2024
Yudong Luo
Yangchen Pan
Han Wang
Philip H. S. Torr
Pascal Poupart
Papers citing
"A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization"
6 / 6 papers shown
Return Capping: Sample-Efficient CVaR Policy Gradient Optimisation
Harry Mead
Clarissa Costen
Bruno Lacerda
Nick Hawes
24
0
0
29 Apr 2025
Measures of Variability for Risk-averse Policy Gradient
Yudong Luo
Yangchen Pan
Jiaqi Tan
Pascal Poupart
40
0
0
15 Apr 2025
The Perfect Blend: Redefining RLHF with Mixture of Judges
Tengyu Xu
Eryk Helenowski
Karthik Abinav Sankararaman
Di Jin
Kaiyan Peng
...
Gabriel Cohen
Yuandong Tian
Hao Ma
Sinong Wang
Han Fang
38
9
0
30 Sep 2024
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
214
843
0
12 Oct 2021
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Arsenii Kuznetsov
Pavel Shvechikov
Alexander Grishin
Dmitry Vetrov
136
185
0
08 May 2020
Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach
Yinlam Chow
Aviv Tamar
Shie Mannor
Marco Pavone
67
310
0
06 Jun 2015