Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

International Conference on Machine Learning (ICML), 2023
29 November 2022
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu Wang
William Yang Wang
    OffRL

Papers citing "Offline Reinforcement Learning with Closed-Form Policy Improvement Operators"

11 / 11 papers shown
Symmetric Behavior Regularized Policy Optimization
Lingwei Zhu
Zheng Chen
Han Wang
Yukie Nagai
Martha White
OffRL
96
0
0
06 Aug 2025
Policy-Based Trajectory Clustering in Offline Reinforcement Learning
Hao Hu
Xinqi Wang
Simon S. Du
OffRL
266
0
0
10 Jun 2025
Behavior Preference Regression for Offline Reinforcement Learning
Padmanaba Srinivasan
William J. Knottenbelt
OffRL
154
0
0
02 Mar 2025
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation
Neural Information Processing Systems (NeurIPS), 2024
Momin Haider
Ming Yin
Menglei Zhang
Arpit Gupta
Jing Zhu
Yu-Xiang Wang
OffRL
142
2
0
30 Oct 2024
Transcendence: Generative Models Can Outperform The Experts That Train Them
Edwin Zhang
Vincent Zhu
Anat Kleiman
Benjamin L. Edelman
Milind Tambe
Sham Kakade
Eran Malach
405
21
0
17 Jun 2024
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
Mustafa Mert Celikok
M. Spaan
Wendelin Bohmer
OffRL
319
0
0
03 Jun 2024
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
Jiachen Li
Weixi Feng
Tsu-Jui Fu
Xinyi Wang
Sugato Basu
Wenhu Chen
William Y. Wang
VGen
216
60
0
29 May 2024
Offline Reinforcement Learning with Behavioral Supervisor Tuning
Padmanaba Srinivasan
William J. Knottenbelt
OffRL
165
4
0
25 Apr 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
International Conference on Machine Learning (ICML), 2024
Songyang Gao
Qiming Ge
Wei Shen
Jiajun Sun
Junjie Ye
...
Yicheng Zou
Zhi Chen
Hang Yan
Tao Gui
Dahua Lin
159
20
0
21 Jan 2024
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Jiachen Li
Qiaozi Gao
Michael Johnston
Xiaofeng Gao
Xuehai He
Suhaila Shakiah
Hangjie Shi
R. Ghanadan
William Y. Wang
LM&Ro
300
17
0
14 Oct 2023
Mildly Constrained Evaluation Policy for Offline Reinforcement Learning
Linjie Xu
Zhengyao Jiang
Jinyu Wang
Lei Song
Jiang Bian
OffRL
191
0
0
06 Jun 2023