ResearchTrend.AI

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

1 October 2019
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
    OffRL

Papers citing "Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning"

50 / 404 papers shown
Title
Beyond Conservatism: Diffusion Policies in Offline Multi-agent Reinforcement Learning
Zhuoran Li
Ling Pan
Longbo Huang
DiffM
OffRL
25
7
0
04 Jul 2023
Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization
Jinxin Liu
Hongyin Zhang
Zifeng Zhuang
Yachen Kang
Donglin Wang
Bin Wang
OffRL
49
8
0
26 Jun 2023
Provably Convergent Policy Optimization via Metric-aware Trust Region Methods
Jun Song
Niao He
Lijun Ding
Chaoyue Zhao
41
3
0
25 Jun 2023
Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting
Zhang-Wei Hong
Pulkit Agrawal
Rémi Tachet des Combes
Romain Laroche
OffRL
45
17
0
22 Jun 2023
Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning
Jinxin Liu
Ziqi Zhang
Zhenyu Wei
Zifeng Zhuang
Yachen Kang
Sibo Gai
Donglin Wang
OffRL
35
16
0
22 Jun 2023
SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling
Jesse Zhang
Karl Pertsch
Jiahui Zhang
Joseph J. Lim
LM&Ro
45
17
0
20 Jun 2023
Datasets and Benchmarks for Offline Safe Reinforcement Learning
Zuxin Liu
Zijian Guo
Haohong Lin
Yi-Fan Yao
Jiacheng Zhu
...
Hanjiang Hu
Wenhao Yu
Tingnan Zhang
Jie Tan
Ding Zhao
OffRL
32
37
0
15 Jun 2023
Offline Multi-Agent Reinforcement Learning with Coupled Value Factorization
Xiangsen Wang
Xianyuan Zhan
OffRL
34
5
0
15 Jun 2023
Improving Offline-to-Online Reinforcement Learning with Q-Ensembles
Kai-Wen Zhao
Yi Ma
Jianye Hao
Jinyi Liu
Yan Zheng
Zhaopeng Meng
OffRL
OnRL
25
12
0
12 Jun 2023
Decoupled Prioritized Resampling for Offline RL
Yang Yue
Bingyi Kang
Xiao Ma
Qisen Yang
Gao Huang
S. Song
Shuicheng Yan
OffRL
27
0
0
08 Jun 2023
Mildly Constrained Evaluation Policy for Offline Reinforcement Learning
Linjie Xu
Zhengyao Jiang
Jinyu Wang
Lei Song
Jiang Bian
OffRL
48
0
0
06 Jun 2023
Boosting Offline Reinforcement Learning with Action Preference Query
Qisen Yang
Shenzhi Wang
Matthieu Lin
S. Song
Gao Huang
OffRL
24
9
0
06 Jun 2023
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data
Chongyi Zheng
Benjamin Eysenbach
Homer Walke
Patrick Yin
Kuan Fang
Ruslan Salakhutdinov
Sergey Levine
SSL
OffRL
44
4
0
06 Jun 2023
Fine-Tuning Language Models with Advantage-Induced Policy Alignment
Banghua Zhu
Hiteshi Sharma
Felipe Vieira Frujeri
Shi Dong
Chenguang Zhu
Michael I. Jordan
Jiantao Jiao
OSLM
36
39
0
04 Jun 2023
Improving and Benchmarking Offline Reinforcement Learning Algorithms
Bingyi Kang
Xiao Ma
Yi-Ren Wang
Yang Yue
Shuicheng Yan
OffRL
16
9
0
01 Jun 2023
IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control
Rohan Chitnis
Yingchen Xu
B. Hashemi
Lucas Lehnert
Ürün Dogan
Zheqing Zhu
Olivier Delalleau
OffRL
34
9
0
01 Jun 2023
Efficient Diffusion Policies for Offline Reinforcement Learning
Bingyi Kang
Xiao Ma
Chao Du
Tianyu Pang
Shuicheng Yan
OffRL
42
63
0
31 May 2023
Offline Meta Reinforcement Learning with In-Distribution Online Adaptation
Jianhao Wang
Jin Zhang
Haozhe Jiang
Junyu Zhang
Liwei Wang
Chongjie Zhang
OffRL
31
9
0
31 May 2023
What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
Rui Yang
Yong Lin
Xiaoteng Ma
Haotian Hu
Chongjie Zhang
Tong Zhang
OffRL
34
23
0
30 May 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
126
3,433
0
29 May 2023
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya Zhang
OffRL
OnRL
45
19
0
25 May 2023
Matrix Estimation for Offline Reinforcement Learning with Low-Rank Structure
Xumei Xi
Chao Yu
Yudong Chen
OffRL
33
0
0
24 May 2023
Inverse Preference Learning: Preference-based RL without a Reward Function
Joey Hejna
Dorsa Sadigh
OffRL
37
48
0
24 May 2023
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
Ashutosh Baheti
Ximing Lu
Faeze Brahman
Ronan Le Bras
Maarten Sap
Mark O. Riedl
38
9
0
24 May 2023
OER: Offline Experience Replay for Continual Offline Reinforcement Learning
Sibo Gai
Donglin Wang
Li He
CLL
OffRL
62
3
0
23 May 2023
Training Diffusion Models with Reinforcement Learning
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
44
320
0
22 May 2023
Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
P. Emedom-Nnamdi
A. Friesen
Bobak Shahriari
Nando de Freitas
Matthew W. Hoffman
OffRL
31
0
0
05 May 2023
Federated Ensemble-Directed Offline Reinforcement Learning
Desik Rengarajan
N. Ragothaman
D. Kalathil
S. Shakkottai
OffRL
35
1
0
04 May 2023
Distance Weighted Supervised Learning for Offline Interaction Data
Joey Hejna
Jensen Gao
Dorsa Sadigh
OffRL
38
13
0
26 Apr 2023
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning
Cheng Lu
Huayu Chen
Jianfei Chen
Hang Su
Chongxuan Li
Jun Zhu
DiffM
OffRL
27
59
0
25 Apr 2023
IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies
Philippe Hansen-Estruch
Ilya Kostrikov
Michael Janner
J. Kuba
Sergey Levine
OffRL
34
130
0
20 Apr 2023
On Context Distribution Shift in Task Representation Learning for Offline Meta RL
Chenyang Zhao
Zihao Zhou
Bing-Quan Liu
OffRL
34
3
0
01 Apr 2023
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions
Yicheng Luo
Jackie Kay
Edward Grefenstette
M. Deisenroth
OffRL
OnRL
27
15
0
30 Mar 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
41
73
0
28 Mar 2023
A Survey of Demonstration Learning
André Rosa de Sousa Porfírio Correia
Luís A. Alexandre
OffRL
38
18
0
20 Mar 2023
HIVE: Harnessing Human Feedback for Instructional Visual Editing
Shu Zhen Zhang
Xinyi Yang
Yihao Feng
Can Qin
Chia-Chih Chen
...
Haiquan Wang
Silvio Savarese
Stefano Ermon
Caiming Xiong
Ran Xu
28
105
0
16 Mar 2023
Goal-conditioned Offline Reinforcement Learning through State Space Partitioning
Mianchu Wang
Yue Jin
Giovanni Montana
OffRL
23
3
0
16 Mar 2023
Cherry-Picking with Reinforcement Learning: Robust Dynamic Grasping in Unstable Conditions
Yunchu Zhang
Liyiming Ke
Abhay Deshpande
Abhishek Gupta
S. Srinivasa
OffRL
19
8
0
09 Mar 2023
Learning Exploration Strategies to Solve Real-World Marble Runs
Alisa Allaire
C. Atkeson
34
0
0
08 Mar 2023
Graph Decision Transformer
Shengchao Hu
Li Shen
Ya Zhang
Dacheng Tao
OffRL
41
15
0
07 Mar 2023
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
Carolin Schmidt
Daniele Gammelli
Francisco Câmara Pereira
Filipe Rodrigues
OffRL
24
4
0
28 Feb 2023
The In-Sample Softmax for Offline Reinforcement Learning
Chenjun Xiao
Han Wang
Yangchen Pan
Adam White
Martha White
OffRL
31
26
0
28 Feb 2023
Behavior Proximal Policy Optimization
Zifeng Zhuang
Kun Lei
Jinxin Liu
Donglin Wang
Yilang Guo
OffRL
35
34
0
22 Feb 2023
Efficient Communication via Self-supervised Information Aggregation for Online and Offline Multi-agent Reinforcement Learning
Cong Guan
F. Chen
Lei Yuan
Zongzhang Zhang
Yang Yu
OffRL
39
4
0
19 Feb 2023
Swapped goal-conditioned offline reinforcement learning
Wenyan Yang
Huiling Wang
Dingding Cai
Joni Pajarinen
Joni-Kristen Kämäräinen
OffRL
OnRL
41
1
0
17 Feb 2023
Pretraining Language Models with Human Preferences
Tomasz Korbak
Kejian Shi
Angelica Chen
Rasika Bhalerao
C. L. Buckley
Jason Phang
Sam Bowman
Ethan Perez
ALM
SyDa
36
209
0
16 Feb 2023
Constrained Decision Transformer for Offline Safe Reinforcement Learning
Zuxin Liu
Zijian Guo
Yi-Fan Yao
Zhepeng Cen
Wenhao Yu
Tingnan Zhang
Ding Zhao
OffRL
33
47
0
14 Feb 2023
Conservative State Value Estimation for Offline Reinforcement Learning
Liting Chen
Jie Yan
Zhengdao Shao
Lu Wang
Qingwei Lin
Saravan Rajmohan
Thomas Moscibroda
Dongmei Zhang
OffRL
26
6
0
14 Feb 2023
ALAN: Autonomously Exploring Robotic Agents in the Real World
Russell Mendonca
Shikhar Bahl
Deepak Pathak
LM&Ro
41
20
0
13 Feb 2023
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies
Qiang-qiang Wang
Robert McCarthy
David Córdova Bulens
Francisco Roldan Sanchez
Kevin McGuinness
Noel E. O'Connor
S. Redmond
OffRL
35
3
0
30 Jan 2023