Safe Exploration for Efficient Policy Evaluation and Comparison

26 February 2022
Runzhe Wan
B. Kveton
Rui Song
    OffRL

Papers citing "Safe Exploration for Efficient Policy Evaluation and Comparison"

8 / 8 papers shown
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
Subhojyoti Mukherjee
Josiah P. Hanna
Robert Nowak
OffRL
48
0
0
04 Jun 2024
Experiment Planning with Function Approximation
Aldo Pacchiano
Jonathan Lee
Emma Brunskill
OffRL
26
3
0
10 Jan 2024
When is Off-Policy Evaluation Useful? A Data-Centric Perspective
Hao Sun
Alex J. Chan
Nabeel Seedat
Alihan Huyuk
M. Schaar
ELM
OffRL
21
1
0
23 Nov 2023
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
Nicholas Corrado
Josiah P. Hanna
OffRL
18
1
0
14 Nov 2023
Directional Optimism for Safe Linear Bandits
Spencer Hutchinson
Berkay Turan
M. Alizadeh
8
8
0
29 Aug 2023
Pure Exploration in Bandits with Linear Constraints
Emil Carlsson
Debabrota Basu
Fredrik D. Johansson
Devdatt Dubhashi
34
2
0
22 Jun 2023
Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale
Botao Hao
Rahul Jain
Dengwang Tang
Zheng Wen
OffRL
26
3
0
20 Mar 2023
SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits
Subhojyoti Mukherjee
Qiaomin Xie
Josiah P. Hanna
R. Nowak
OffRL
47
5
0
29 Jan 2023