ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.18729
  4. Cited By
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement
  Learning

Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning

29 May 2024
Tianle Zhang
Jiayi Guan
Lin Zhao
Yihang Li
Dongjiang Li
Zecui Zeng
Lei Sun
Yue Chen
Xuelong Wei
Lusong Li
Xiaodong He
ArXivPDFHTML

Papers citing "Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning"

8 / 8 papers shown
Title
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury
Anush Kini
Nagarajan Natarajan
22
54
0
01 Mar 2024
Offline Reinforcement Learning via High-Fidelity Generative Behavior
  Modeling
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
Huayu Chen
Cheng Lu
Chengyang Ying
Hang Su
Jun Zhu
DiffM
OffRL
76
103
0
29 Sep 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Offline Reinforcement Learning with Implicit Q-Learning
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
203
627
0
12 Oct 2021
Uncertainty-Based Offline Reinforcement Learning with Diversified
  Q-Ensemble
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An
Seungyong Moon
Jang-Hyun Kim
Hyun Oh Song
OffRL
92
261
0
04 Oct 2021
Conservative Offline Distributional Reinforcement Learning
Conservative Offline Distributional Reinforcement Learning
Yecheng Jason Ma
Dinesh Jayaraman
Osbert Bastani
OffRL
55
77
0
12 Jul 2021
COMBO: Conservative Offline Model-Based Policy Optimization
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
194
412
0
16 Feb 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on
  Open Problems
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
321
1,662
0
04 May 2020
1