ResearchTrend.AI

arXiv: 2302.07457 · Cited By
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning

15 February 2023
Siliang Zeng, Chenliang Li, Alfredo García, Min-Fong Hong
[OffRL]

Papers citing "When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning"

9 papers:
1. Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism (29 May 2023)
   Zihao Li, Zhuoran Yang, Mengdi Wang [OffRL]

2. Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees (04 Oct 2022)
   Siliang Zeng, Chenliang Li, Alfredo García, Min-Fong Hong

3. Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning (22 Sep 2022)
   Paul Rolland, Luca Viano, Norman Schuerhoff, Boris Nikolov, V. Cevher [OffRL]

4. Training language models to follow instructions with human feedback (04 Mar 2022)
   Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe [OSLM, ALM]

5. Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage (13 Jul 2021)
   Masatoshi Uehara, Wen Sun [OffRL]

6. COMBO: Conservative Offline Model-Based Policy Optimization (16 Feb 2021)
   Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn [OffRL]

7. A Finite Time Analysis of Two Time-Scale Actor Critic Methods (04 May 2020)
   Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu

8. Deep Reinforcement Learning for Autonomous Driving: A Survey (02 Feb 2020)
   B. R. Kiran, Ibrahim Sobh, V. Talpaert, Patrick Mannion, A. A. Sallab, S. Yogamani, P. Pérez

9. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (05 Dec 2016)
   Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell [UQCV, BDL]