Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.00177
Cited By
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
1 October 2019
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning"
50 / 404 papers shown
Title
Hierarchical Multi-agent Meta-Reinforcement Learning for Cross-channel Bidding
Shenghong He
Chao Yu
41
0
0
26 Dec 2024
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Zhihao Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feiferi Feng
OffRL
109
1
0
22 Dec 2024
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
Max Sobol Mark
Tian Gao
Georgia Gabriela Sampaio
Mohan Kumar Srirama
Archit Sharma
Chelsea Finn
Aviral Kumar
OffRL
OnRL
106
4
0
09 Dec 2024
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
Tewodros Ayalew
Xiao Zhang
Kevin Yuanbo Wu
Tianchong Jiang
Michael Maire
Matthew R. Walter
OffRL
83
1
0
26 Nov 2024
AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers
Jake Grigsby
Justin Sasek
Samyak Parajuli
Daniel Adebi
Amy Zhang
Yuke Zhu
OffRL
33
3
0
17 Nov 2024
Vision Language Models are In-Context Value Learners
Yecheng Jason Ma
Joey Hejna
Ayzaan Wahid
Chuyuan Fu
Dhruv Shah
...
Dinesh Jayaraman
Wenhao Yu
Tingnan Zhang
Dorsa Sadigh
Fei Xia
65
6
0
07 Nov 2024
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
Rui Yang
Jie Wang
Guoping Wu
Yangqiu Song
AAML
OffRL
56
1
0
01 Nov 2024
Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers
Kai Yan
Alex Schwing
Yu-xiong Wang
OffRL
OnRL
41
0
0
31 Oct 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
OffRL
44
0
0
27 Oct 2024
Efficient Diversity-based Experience Replay for Deep Reinforcement Learning
Kaiyan Zhao
Yiming Wang
Yuyang Chen
Yan Li
Leong Hou U
Xiaoguang Niu
44
1
0
27 Oct 2024
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
67
10
0
26 Oct 2024
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
Yixiu Mao
Qi Wang
Chen Chen
Yun Qu
Xiangyang Ji
OffRL
53
6
0
25 Oct 2024
Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning
Hanlin Yang
Jian Yao
Weiming Liu
Qing Wang
Hanmin Qin
...
Hongwu Chen
Juchao Zhuo
Qiang Fu
Yang Wei
Haobo Fu
34
1
0
21 Oct 2024
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces
Jifeng Hu
Sili Huang
Li Shen
Zhejian Yang
Shengchao Hu
Shisong Tang
Hechang Chen
Yi Chang
Dacheng Tao
Lichao Sun
OffRL
44
0
0
21 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
79
14
0
17 Oct 2024
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
239
1
0
11 Oct 2024
Reward Learning From Preference With Ties
Jinsong Liu
Dongdong Ge
Ruihao Zhu
39
3
0
05 Oct 2024
Robust Offline Imitation Learning from Diverse Auxiliary Data
Udita Ghosh
Dripta S. Raychaudhuri
Jiachen Li
Konstantinos Karydis
Amit K. Roy-Chowdhury
OffRL
31
1
0
04 Oct 2024
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
Yun Qu
Boyuan Wang
Yuhang Jiang
Jianzhun Shao
Yixiu Mao
Cheems Wang
Chang Liu
Xiangyang Ji
51
4
0
03 Oct 2024
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
The Viet Bui
Thanh Hong Nguyen
Tien Mai
OffRL
38
0
0
02 Oct 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng
Ruixi Qiao
Gang Xiong
Binhua Li
Yingwei Ma
Binhua Li
Yongbin Li
Yisheng Lv
OffRL
OnRL
LM&Ro
50
3
0
01 Oct 2024
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance
Renming Huang
Shaochong Liu
Yunqiang Pei
Peng Wang
Guoqing Wang
Yang Yang
Hengtao Shen
OffRL
45
0
0
06 Sep 2024
Diffusion Policy Policy Optimization
Allen Z. Ren
Justin Lidard
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Anirudha Majumdar
Benjamin Burchfiel
Hongkai Dai
Max Simchowitz
62
38
0
01 Sep 2024
Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning
Minjong Yoo
Sangwoo Cho
Honguk Woo
OffRL
45
10
0
28 Aug 2024
Unsupervised-to-Online Reinforcement Learning
Junsu Kim
Seohong Park
Sergey Levine
OnRL
65
3
0
27 Aug 2024
F1tenth Autonomous Racing With Offline Reinforcement Learning Methods
Prajwal Koirala
Cody Fleming
OffRL
45
1
0
08 Aug 2024
SelfBC: Self Behavior Cloning for Offline Reinforcement Learning
Shirong Liu
Chenjia Bai
Zixian Guo
Hao Zhang
Gaurav Sharma
Yang Liu
OffRL
40
2
0
04 Aug 2024
How to Choose a Reinforcement-Learning Algorithm
Fabian Bongratz
Vladimir Golkov
Lukas Mautner
Luca Della Libera
Frederik Heetmeyer
Felix Czaja
Julian Rodemann
Daniel Cremers
34
1
0
30 Jul 2024
Offline Imitation Learning Through Graph Search and Retrieval
Zhao-Heng Yin
Pieter Abbeel
OffRL
55
3
0
22 Jul 2024
Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review
Masatoshi Uehara
Yulai Zhao
Tommaso Biancalani
Sergey Levine
71
22
0
18 Jul 2024
New Desiderata for Direct Preference Optimization
Xiangkun Hu
Tong He
David Wipf
61
2
0
12 Jul 2024
Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control
Huayu Chen
Kaiwen Zheng
Hang Su
Jun Zhu
58
1
0
12 Jul 2024
Gradient Boosting Reinforcement Learning
Benjamin Fuhrer
Chen Tessler
Gal Dalal
OffRL
AI4CE
57
3
0
11 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
47
12
0
06 Jul 2024
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao
Yucheng Xin
Silang Wu
Longxiang He
Zichen Yan
Junbo Tan
Xueqian Wang
OffRL
69
0
0
06 Jul 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li
Yifan Wang
A. Grama
Ruqi Zhang
Ruqi Zhang
AI4TS
51
9
0
24 Jun 2024
RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold
Amrith Rajagopal Setlur
Saurabh Garg
Xinyang Geng
Naman Garg
Virginia Smith
Aviral Kumar
47
48
0
20 Jun 2024
Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing
Xinbo Zhao
Yingxue Zhang
Xin Zhang
Yu Yang
Yiqun Xie
Yanhua Li
Jun Luo
OffRL
45
2
0
20 Jun 2024
Equivariant Offline Reinforcement Learning
Arsh Tangri
Ondrej Biza
Dian Wang
David Klee
Owen Howell
Robert Platt
OffRL
44
3
0
20 Jun 2024
Efficient Offline Reinforcement Learning: The Critic is Critical
Adam Jelley
Trevor A. McInroe
Sam Devlin
Amos Storkey
OffRL
50
1
0
19 Jun 2024
Is Value Learning Really the Main Bottleneck in Offline RL?
Seohong Park
Kevin Frans
Sergey Levine
Aviral Kumar
OffRL
58
9
0
13 Jun 2024
Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei
Aidan Scannell
Joni Pajarinen
OffRL
58
1
0
12 Jun 2024
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Qi Lv
Xiang Deng
Gongwei Chen
Michael Yu Wang
Liqiang Nie
78
7
0
08 Jun 2024
Strategically Conservative Q-Learning
Yutaka Shimizu
Joey Hong
Sergey Levine
Masayoshi Tomizuka
OffRL
OnRL
50
0
0
06 Jun 2024
UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning
Yu Zhang
Rui Yu
Zhipeng Yao
Wenyuan Zhang
Jun Wang
Liming Zhang
OffRL
60
0
0
05 Jun 2024
Amortizing intractable inference in diffusion models for vision, language, and control
S. Venkatraman
Moksh Jain
Luca Scimeca
Minsu Kim
Marcin Sendera
...
Alexandre Adam
Jarrid Rector-Brooks
Yoshua Bengio
Glen Berseth
Nikolay Malkin
70
26
0
31 May 2024
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang
Ruoxue Liu
Jing Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
61
3
0
31 May 2024
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu
Yang Li
Yixing Lan
Hao Gao
Wei Pan
Xin Xu
OffRL
38
5
0
30 May 2024
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
Hengkai Tan
Songming Liu
Kai Ma
Chengyang Ying
Xingxing Zhang
Hang Su
Jun Zhu
47
2
0
30 May 2024
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Shenao Zhang
Donghan Yu
Hiteshi Sharma
Ziyi Yang
Shuohang Wang
Hany Hassan
Zhaoran Wang
LRM
53
28
0
29 May 2024
Previous
1
2
3
4
5
6
7
8
9
Next