arXiv: 1903.08738
Batch Policy Learning under Constraints

20 March 2019
Hoang Minh Le
Cameron Voloshin
Yisong Yue
OffRL

Papers citing "Batch Policy Learning under Constraints"

30 / 30 papers shown
Offline Constrained Reinforcement Learning under Partial Data Coverage
Kihyuk Hong
Ambuj Tewari
OffRL
84
0
0
23 May 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
195
2
0
22 Feb 2025
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Zijian Guo
Weichao Zhou
Wenchao Li
OffRL
131
2
0
28 Jan 2025
Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing
Jitao Wang
C. Shi
John D. Piette
Joshua R. Loftus
Donglin Zeng
Zhenke Wu
OffRL
112
0
0
10 Jan 2025
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Claire Chen
Shuze Liu
Shangtong Zhang
OffRL
301
1
0
08 Oct 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu
Claire Chen
Shangtong Zhang
OffRL
137
3
0
03 Oct 2024
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
Toshinori Kitamura
Tadashi Kozuno
Wataru Kumagai
Kenta Hoshino
Y. Hosoe
Kazumi Kasaura
Masashi Hamaya
Paavo Parmas
Yutaka Matsuo
89
2
0
29 Aug 2024
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
Tao Ma
Xuzhi Yang
Zoltan Szabo
OffRL
105
0
0
01 Jul 2024
LTL-Constrained Policy Optimization with Cycle Experience Replay
Ameesh Shah
Cameron Voloshin
Chenxi Yang
Abhinav Verma
Swarat Chaudhuri
Sanjit A. Seshia
83
1
0
17 Apr 2024
Control Regularization for Reduced Variance Reinforcement Learning
Richard Cheng
Abhinav Verma
G. Orosz
Swarat Chaudhuri
Yisong Yue
J. W. Burdick
OffRL
56
77
0
14 May 2019
Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic
Mikael Henaff
A. Canziani
Yann LeCun
OOD
74
122
0
08 Jan 2019
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
110
354
0
29 Oct 2018
Self-Imitation Learning
Junhyuk Oh
Yijie Guo
Satinder Singh
Honglak Lee
SSL
51
249
0
14 Jun 2018
Accelerating Imitation Learning with Predictive Models
Ching-An Cheng
Xinyan Yan
Evangelos A. Theodorou
Byron Boots
51
21
0
12 Jun 2018
A Reductions Approach to Fair Classification
Alekh Agarwal
A. Beygelzimer
Miroslav Dudík
John Langford
Hanna M. Wallach
FaML
154
1,094
0
06 Mar 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
58
267
0
10 Feb 2018
Safe Exploration in Continuous Action Spaces
Gal Dalal
Krishnamurthy Dvijotham
Matej Vecerík
Todd Hester
Cosmin Paduraru
Yuval Tassa
46
438
0
26 Jan 2018
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
101
1,313
0
30 May 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
77
1,329
0
27 Feb 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
72
220
0
04 Dec 2016
Guided Policy Search as Approximate Mirror Descent
William H. Montgomery
Sergey Levine
65
125
0
15 Jul 2016
Smooth Imitation Learning for Online Sequence Prediction
Hoang Minh Le
Andrew Kang
Yisong Yue
Peter Carr
50
33
0
03 Jun 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
276
573
0
04 Apr 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
157
621
0
11 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
146
7,590
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
237
13,174
0
09 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
254
6,722
0
19 Feb 2015
Reinforcement and Imitation Learning via Interactive No-Regret Learning
Stéphane Ross
J. Andrew Bagnell
OffRL
96
262
0
23 Jun 2014
A Survey of Multi-Objective Sequential Decision-Making
D. Roijers
Peter Vamplew
Shimon Whiteson
Richard Dazeley
77
648
0
04 Feb 2014
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
193
694
0
23 Mar 2011