Hyperparameter Selection for Offline Reinforcement Learning

17 July 2020
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
    GP, OffRL

Papers citing "Hyperparameter Selection for Offline Reinforcement Learning"

50 / 104 papers shown
Towards an Adaptive Social Game-Playing Robot: An Offline Reinforcement Learning-Based Framework
Soon Jynn Chu
Raju Gottumukkala
Alan Barhorst
OffRL
172
0
0
21 Sep 2025
SOReL and TOReL: Two Methods for Fully Offline Reinforcement Learning
Mattie Fellows
Clarisse Wibault
Uljad Berdica
Johannes Forkel
Jakob Foerster
Michael A. Osborne
OffRL, OnRL
362
0
0
28 May 2025
A Clean Slate for Offline Reinforcement Learning
Matthew Jackson
Uljad Berdica
Jarek Liesen
Shimon Whiteson
Jakob Foerster
OffRL, OnRL
497
3
0
15 Apr 2025
Hyperparameter Optimisation with Practical Interpretability and Explanation Methods in Probabilistic Curriculum Learning
Llewyn Salt
Marcus Gallagher
309
0
0
09 Apr 2025
RARE: Retrieval-Augmented Reasoning Modeling
Zhengren Wang
Jiayang Yu
Dongsheng Ma
Zhe Chen
Yu Wang
...
Feiyu Xiong
Yanfeng Wang
Weinan E
Linpeng Tang
RALM, LRM
469
7
0
30 Mar 2025
Harmonia: A Multi-Agent Reinforcement Learning Approach to Data Placement and Migration in Hybrid Storage Systems
Rakesh Nadig
Vamanan Arulchelvan
Rahul Bera
Taha Shahroodi
Gagandeep Singh
Mohammad Sadrosadati
Onur Mutlu
443
3
0
26 Mar 2025
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
International Conference on Learning Representations (ICLR), 2025
Yoav Wald
M. Goldstein
Yonathan Efroni
Wouter A. C. van Amsterdam
Rajesh Ranganath
CML
405
0
0
20 Mar 2025
Off-Policy Selection for Initiating Human-Centric Experimental Design
Neural Information Processing Systems (NeurIPS), 2024
Ge Gao
Xi Yang
Qitong Gao
Song Ju
Miroslav Pajic
Min Chi
OffRL
345
0
0
26 Oct 2024
AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design
International Conference on Agents and Artificial Intelligence (ICAART), 2024
Francisco Erivaldo Fernandes Junior
Antti Oulasvirta
1.2K
1
0
25 Oct 2024
Experimental evaluation of offline reinforcement learning for HVAC control in buildings
Jun Wang
Linyan Li
Qi Liu
Yu Yang
OffRL, AI4CE
214
2
0
15 Aug 2024
On the consistency of hyper-parameter selection in value-based deep reinforcement learning
J. Obando-Ceron
J. G. Araújo
Rameswar Panda
Pablo Samuel Castro
450
20
0
25 Jun 2024
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models
Masatoshi Uehara
Yulai Zhao
Ehsan Hajiramezanali
Gabriele Scalia
Gökçen Eraslan
Avantika Lal
Sergey Levine
Tommaso Biancalani
463
28
0
30 May 2024
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
Yuta Saito
Masahiro Nomura
OffRL
354
5
0
23 Apr 2024
Towards Diverse Behaviors: A Benchmark for Imitation Learning with Human Demonstrations
Xiaogang Jia
Denis Blessing
Xinkai Jiang
Moritz Reuss
Atalay Donat
Rudolf Lioutikov
Gerhard Neumann
320
48
0
22 Feb 2024
Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning
Khyati Khandelwal
Albert Thomas
Jun Yao
OffRL
254
2
0
05 Feb 2024
Adversarially Trained Actor Critic for offline CMDPs
Neural Information Processing Systems (NeurIPS), 2024
Honghao Wei
Xiyue Peng
Xin Liu
Arnob Ghosh
AAML, OffRL
146
0
0
01 Jan 2024
When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
Vincent Liu
P. Nagarajan
Andrew Patterson
Martha White
OffRL
437
3
0
04 Dec 2023
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation
International Conference on Learning Representations (ICLR), 2023
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL
523
15
0
30 Nov 2023
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL, ELM
548
5
0
30 Nov 2023
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
International Conference on Learning Representations (ICLR), 2023
Kun Lei
Zhengmao He
Chenhao Lu
Kaizhe Hu
Yang Gao
Huazhe Xu
OffRL, OnRL
433
29
0
06 Nov 2023
State-Action Similarity-Based Representations for Off-Policy Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Brahma S. Pavse
Josiah P. Hanna
OffRL
316
4
0
27 Oct 2023
Robustness of Algorithms for Causal Structure Learning to Hyperparameter Choice
Conference on Causal Learning and Reasoning (CLeaR), 2023
Damian Machlanski
Spyridon Samothrakis
Paul Clarke
CML
308
4
0
27 Oct 2023
Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Shengpu Tang
Jenna Wiens
OffRL, CML
304
6
0
26 Oct 2023
ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning
Network and Distributed System Security Symposium (NDSS), 2023
L. Du
Min Chen
Mingyang Sun
Shouling Ji
Peng Cheng
Jiming Chen
Zhikun Zhang
OffRL
361
13
0
06 Sep 2023
Active Policy Improvement from Multiple Black-box Oracles
International Conference on Machine Learning (ICML), 2023
Xuefeng Liu
Takuma Yoneda
Simon Mahns
Matthew R. Walter
Yuxin Chen
451
13
0
17 Jun 2023
π2vec: Policy Representations with Successor Features
International Conference on Learning Representations (ICLR), 2023
Gianluca Scarpellini
Ksenia Konyushkova
Claudio Fantacci
T. Paine
Yutian Chen
Misha Denil
OffRL
282
1
0
16 Jun 2023
Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes
Luca Sabbioni
Francesco Corda
Marcello Restelli
221
0
0
13 Jun 2023
Explaining RL Decisions with Trajectories
International Conference on Learning Representations (ICLR), 2023
Shripad Deshmukh
Arpan Dasgupta
Balaji Krishnamurthy
Nan Jiang
Chirag Agarwal
Georgios Theocharous
J. Subramanian
OffRL
265
9
0
06 May 2023
A Survey of Demonstration Learning
André Rosa de Sousa Porfírio Correia
Luís A. Alexandre
OffRL
279
36
0
20 Mar 2023
Scalable End-to-End ML Platforms: from AutoML to Self-serve
I. Markov
P. Apostolopoulos
Mia Garrard
Tianyu Qie
Yin Huang
...
Anika Li
Cesar Cardoso
George Han
Ryan Maghsoudian
Norm Zhou
LRM
457
6
0
27 Feb 2023
Behavior Proximal Policy Optimization
International Conference on Learning Representations (ICLR), 2023
Zifeng Zhuang
Kun Lei
Jinxin Liu
Xuetao Zhang
Yilang Guo
OffRL
367
51
0
22 Feb 2023
Machine Learning Systems: A Survey from a Data-Oriented Perspective
ACM Computing Surveys (ACM Comput. Surv.), 2023
Christian Cabrera
Andrei Paleyes
Pierre Thodoroff
Neil D. Lawrence
OOD, AI4TS, AI4CE
340
7
0
09 Feb 2023
A Strong Baseline for Batch Imitation Learning
Matthew Smith
Lucas Maystre
Zhenwen Dai
K. Ciosek
OffRL
185
5
0
06 Feb 2023
Revisiting Bellman Errors for Offline Model Selection
International Conference on Machine Learning (ICML), 2023
Joshua P. Zitovsky
Daniel de Marchi
Rishabh Agarwal
Michael R. Kosorok (University of North Carolina at Chapel Hill)
OffRL
345
6
0
31 Jan 2023
Model-based Offline Reinforcement Learning with Local Misspecification
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kefan Dong
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
265
6
0
26 Jan 2023
Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction
AAAI Conference on Artificial Intelligence (AAAI), 2022
Brahma S. Pavse
Josiah P. Hanna
OffRL
228
9
0
14 Dec 2022
Benchmarking Offline Reinforcement Learning Algorithms for E-Commerce Order Fraud Evaluation
Soysal Degirmenci
Chris Jones
OffRL
161
1
0
05 Dec 2022
Policy-Adaptive Estimator Selection for Off-Policy Evaluation
AAAI Conference on Artificial Intelligence (AAAI), 2022
Takuma Udagawa
Haruka Kiyohara
Yusuke Narita
Yuta Saito
Keisuke Tateno
OffRL
300
29
0
25 Nov 2022
Oracle Inequalities for Model Selection in Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Jonathan Lee
George Tucker
Ofir Nachum
Bo Dai
Emma Brunskill
OffRL
392
14
0
03 Nov 2022
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Neural Information Processing Systems (NeurIPS), 2022
Audrey Huang
Nan Jiang
OffRL
218
9
0
27 Oct 2022
Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data
Neural Information Processing Systems (NeurIPS), 2022
Allen Nie
Yannis Flet-Berliac
Deon R. Jordan
William Steenbergen
Emma Brunskill
OffRL
353
14
0
16 Oct 2022
AnalogVNN: A fully modular framework for modeling and optimizing photonic neural networks
APL Machine Learning (AML), 2022
Vivswan Shah
Nathan Youngblood
258
7
0
14 Oct 2022
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
International Conference on Learning Representations (ICLR), 2022
Jihwan Jeong
Xiaoyu Wang
Michael Gimelfarb
Hyunwoo J. Kim
Baher Abdulhai
Scott Sanner
OffRL
231
15
0
07 Oct 2022
Hierarchical reinforcement learning for in-hand robotic manipulation using Davenport chained rotations
International Conference on Automation, Robotics and Applications (ICARA), 2022
Francisco Roldan Sanchez
Qiang-qiang Wang
David Córdova Bulens
Kevin McGuinness
Stephen J. Redmond
Noel E. O'Connor
176
1
0
03 Oct 2022
Ensemble Reinforcement Learning in Continuous Spaces -- A Hierarchical Multi-Step Approach for Policy Training
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Gang Chen
Victoria Huang
OffRL
341
3
0
29 Sep 2022
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
International Conference on Machine Learning (ICML), 2022
Taku Yamagata
Ahmed Khalil
Raúl Santos-Rodríguez
OffRL
690
121
0
08 Sep 2022
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
International Conference on Machine Learning (ICML), 2022
Haoran Xu
Xianyuan Zhan
Honglei Yin
Huiling Qin
OffRL
337
104
0
20 Jul 2022
An Empirical Study of Implicit Regularization in Deep Offline RL
Çağlar Gülçehre
Srivatsan Srinivasan
Jakub Sygnowski
Georg Ostrovski
Mehrdad Farajtabar
Matt Hoffman
Razvan Pascanu
Arnaud Doucet
OffRL
367
23
0
05 Jul 2022
Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning
David Brandfonbrener
Rémi Tachet des Combes
Romain Laroche
OffRL
269
5
0
02 Jun 2022
Offline Policy Comparison with Confidence: Benchmarks and Baselines
Anurag Koul
Mariano Phielipp
Alan Fern
OffRL
293
0
0
22 May 2022