ResearchTrend.AI


Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
arXiv:1904.06387, 12 April 2019
Daniel S. Brown, Wonjoon Goo, P. Nagarajan, S. Niekum

Papers citing "Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations"

Showing 50 of 80 citing papers.
Reinforcement Learning from Multi-level and Episodic Human Feedback
Muhammad Qasim Elahi, Somtochukwu Oguchienti, Maheed H. Ahmed, Mahsa Ghasemi [OffRL] (20 Apr 2025)

Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models
Zhanpeng He, Yifeng Cao, M. Ciocarlie (26 Feb 2025)

Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
Hao Sun, M. Schaar (28 Jan 2025)

Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation
N. Dennler, Stefanos Nikolaidis, Maja J. Matarić (03 Jan 2025)

Incremental Learning for Robot Shared Autonomy
Yiran Tao, Guixiu Qiao, Dan Ding, Zackory Erickson [CLL] (08 Oct 2024)

Control-oriented Clustering of Visual Latent Representation
Han Qi, Haocheng Yin, Heng Yang [SSL] (07 Oct 2024)

Robust Offline Imitation Learning from Diverse Auxiliary Data
Udita Ghosh, Dripta S. Raychaudhuri, Jiachen Li, Konstantinos Karydis, A. Roy-Chowdhury [OffRL] (04 Oct 2024)

Online Control-Informed Learning
Zihao Liang, Tianyu Zhou, Zehui Lu, Shaoshuai Mou (04 Oct 2024)

Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang, Bin Gao, Ya-xiang Yuan (30 May 2024)

A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, P. Parrilo [OffRL] (20 May 2024)

Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani, M. E. Taylor [OffRL] (30 Apr 2024)

A Generalized Acquisition Function for Preference-based Reward Learning
Evan Ellis, Gaurav R. Ghosal, Stuart J. Russell, Anca Dragan, Erdem Biyik (09 Mar 2024)

Bayesian Constraint Inference from User Demonstrations Based on Margin-Respecting Preference Models
Dimitris Papadimitriou, Daniel S. Brown (04 Mar 2024)

A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations
E. C. Ozcan, Vittorio Giammarino, James Queeney, I. Paschalidis [OffRL] (29 Feb 2024)

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang (28 Feb 2024)

Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu, Michael I. Jordan, Jiantao Jiao (29 Jan 2024)

Crowd-PrefRL: Preference-Based Reward Learning from Crowds
David Chhan, Ellen R. Novoseller, Vernon J. Lawhern (17 Jan 2024)

Aligning Human Intent from Imperfect Demonstrations with Confidence-based Inverse soft-Q Learning
Xizhou Bu, Wenjuan Li, Zhengxiong Liu, Zhiqiang Ma, Panfeng Huang (18 Dec 2023)

A density estimation perspective on learning from pairwise human preferences
Vincent Dumoulin, Daniel D. Johnson, Pablo Samuel Castro, Hugo Larochelle, Yann Dauphin (23 Nov 2023)

Learning to Discern: Imitating Heterogeneous Human Demonstrations with Preference and Representation Learning
Sachit Kuhar, Shuo Cheng, Shivang Chopra, Matthew Bronars, Danfei Xu (22 Oct 2023)

Learning Reward for Physical Skills using Large Language Model
Yuwei Zeng, Yiqing Xu (21 Oct 2023)

Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement Learning with Sub-optimal Demonstrations
Lu Li, Yuxin Pan, Ruobing Chen, Jie Liu, Zilin Wang, Yu Liu, Zhiheng Li (13 Oct 2023)

Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond
Hao Sun [OffRL] (09 Oct 2023)

Rating-based Reinforcement Learning
Devin White, Mingkang Wu, Ellen R. Novoseller, Vernon J. Lawhern, Nicholas R. Waytowich, Yongcan Cao [ALM] (30 Jul 2023)

On the Expressivity of Multidimensional Markov Reward
Shuwa Miura (22 Jul 2023)

Programmatic Imitation Learning from Unlabeled and Noisy Demonstrations
Jimmy Xin, Linus Zheng, Kia Rahmani, Jiayi Wei, Jarrett Holtz, Işıl Dillig, Joydeep Biswas (02 Mar 2023)

Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Yunke Wang, Bo Du, Chang Xu (13 Feb 2023)

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons
Banghua Zhu, Jiantao Jiao, Michael I. Jordan [OffRL] (26 Jan 2023)

On The Fragility of Learned Reward Functions
Lev McKinney, Yawen Duan, David M. Krueger, Adam Gleave (09 Jan 2023)

Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Daniel Shin, Anca Dragan, Daniel S. Brown [OffRL] (03 Jan 2023)

Explaining Imitation Learning through Frames
Boyuan Zheng, Jianlong Zhou, Chun-Hao Liu, Yiqiao Li, Fang Chen (03 Jan 2023)

SIRL: Similarity-based Implicit Representation Learning
Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca Dragan [SSL, DRL] (02 Jan 2023)

Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu, Chenyan Jia, Ge Zhang, Ziyu Zhuang, Tony X. Liu, Soroush Vosoughi (01 Jan 2023)

Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo, Jingyue Gao, Zheng Wu, Chengming Shi, Jianyu Chen [OffRL] (03 Dec 2022)

Time-Efficient Reward Learning via Visually Assisted Cluster Ranking
David Zhang, Micah Carroll, Andreea Bobu, Anca Dragan (30 Nov 2022)

Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots
Akanksha Saran, K. Desai, M. L. Chang, Rudolf Lioutikov, A. Thomaz, S. Niekum (01 Nov 2022)

D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning
Caroline Wang, Garrett A. Warnell, Peter Stone (26 Oct 2022)

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation
Chengqian Gao, Kelvin Xu, Liu Liu, Deheng Ye, P. Zhao, Zhiqiang Xu [OffRL] (19 Oct 2022)

Extraneousness-Aware Imitation Learning
Rachel Zheng, Kaizhe Hu, Zhecheng Yuan, Boyuan Chen, Huazhe Xu [SSL] (04 Oct 2022)

Bayesian Q-learning With Imperfect Expert Demonstrations
Fengdi Che, Xiru Zhu, Doina Precup, D. Meger, Gregory Dudek (01 Oct 2022)

Calculus on MDPs: Potential Shaping as a Gradient
Erik Jenner, H. V. Hoof, Adam Gleave (20 Aug 2022)

Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
Haoran Xu, Xianyuan Zhan, Honglei Yin, Huiling Qin [OffRL] (20 Jul 2022)

Energy-based Legged Robots Terrain Traversability Modeling via Deep Inverse Reinforcement Learning
Lu Gan, J. Grizzle, Ryan Eustice, Maani Ghaffari (07 Jul 2022)

Auto-Encoding Adversarial Imitation Learning
Kaifeng Zhang, Rui Zhao, Ziming Zhang, Yang Gao (22 Jun 2022)

Contrastive Learning as Goal-Conditioned Reinforcement Learning
Benjamin Eysenbach, Tianjun Zhang, Ruslan Salakhutdinov, Sergey Levine [SSL, OffRL] (15 Jun 2022)

Model-based Offline Imitation Learning with Non-expert Data
Jeongwon Park, Lin F. Yang [OffRL] (11 Jun 2022)

Receding Horizon Inverse Reinforcement Learning
Yiqing Xu, Wei Gao, David Hsu (09 Jun 2022)

Aligning Robot Representations with Humans
Andreea Bobu, Andi Peng (15 May 2022)

Adversarial Training for High-Stakes Reliability
Daniel M. Ziegler, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, ..., Noa Nabeshima, Benjamin Weinstein-Raun, D. Haas, Buck Shlegeris, Nate Thomas [AAML] (03 May 2022)

Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Yuchen Cui, S. Niekum, Abhi Gupta, Vikash Kumar, Aravind Rajeswaran [LM&Ro] (23 Apr 2022)