ResearchTrend.AI
Bootstrapping Upper Confidence Bound
12 June 2019
Botao Hao
Yasin Abbasi-Yadkori
Zheng Wen
Guang Cheng

Papers citing "Bootstrapping Upper Confidence Bound"

38 / 38 papers shown
Robust Dynamic Material Handling via Adaptive Constrained Evolutionary Reinforcement Learning
Chengpeng Hu
Ziming Wang
Bo Yuan
Jialin Liu
Chengqi Zhang
Xin Yao
20
0
0
20 Jun 2025
Not All Documents Are What You Need for Extracting Instruction Tuning Data
Chi Zhang
Huaping Zhong
Hongtao Li
Chengliang Chai
Jiawei Hong
...
Jiantao Qiu
Ye Yuan
Guoren Wang
Zeang Sheng
Lei Cao
SyDa
66
0
0
18 May 2025
Reward-Safety Balance in Offline Safe RL via Diffusion Regularization
Junyu Guo
Zhi Zheng
Donghao Ying
Ming Jin
Shangding Gu
C. Spanos
Javad Lavaei
OffRL
198
0
0
18 Feb 2025
UCB algorithms for multi-armed bandits: Precise regret and adaptive inference
Q. Han
K. Khamaru
Cun-Hui Zhang
118
5
0
09 Dec 2024
Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach
Xiongxiao Xu
Solomon Abera Bekele
Brice Videau
Kai Shu
53
1
0
03 Oct 2024
Robotic Optimization of Powdered Beverages Leveraging Computer Vision and Bayesian Optimization
E. Szymańska
Josie Hughes
73
0
0
17 Sep 2024
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
S. Samsonov
Eric Moulines
Qi-Man Shao
Zhuo-Song Zhang
Alexey Naumov
99
5
0
26 May 2024
Zero-Inflated Bandits
Haoyu Wei
Runzhe Wan
Lei Shi
Rui Song
108
0
0
25 Dec 2023
Multi-Agent Probabilistic Ensembles with Trajectory Sampling for Connected Autonomous Vehicles
Ruoqi Wen
Jiahao Huang
Rongpeng Li
Guoru Ding
Zhifeng Zhao
81
1
0
21 Dec 2023
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang
Lei Wu
84
3
0
01 Oct 2023
Feature Normalization Prevents Collapse of Non-contrastive Learning Dynamics
Han Bao
SSL, MLT
97
1
0
28 Sep 2023
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling
Susobhan Ghosh
Raphael Kim
Prasidh Chhabria
Raaz Dwivedi
Predrag Klasnja
Peng Liao
Kelly Zhang
Susan Murphy
OffRL
72
9
0
11 Apr 2023
Tight Non-asymptotic Inference via Sub-Gaussian Intrinsic Moment Norm
Huiming Zhang
Haoyu Wei
Guang Cheng
77
1
0
13 Mar 2023
Multiplier Bootstrap-based Exploration
Runzhe Wan
Haoyu Wei
Branislav Kveton
R. Song
52
3
0
03 Feb 2023
Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent
Xinyu Chen
Zehua Lai
He Li
Yichen Zhang
80
4
0
30 Dec 2022
Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?
Yi Tian
Kai Zhang
Russ Tedrake
S. Sra
85
5
0
30 Dec 2022
A Critical Review of Traffic Signal Control and A Novel Unified View of Reinforcement Learning and Model Predictive Control Approaches for Adaptive Traffic Signal Control
Xiaoyu Wang
Scott Sanner
Baher Abdulhai
61
5
0
26 Nov 2022
Lower Bounds for the Convergence of Tensor Power Iteration on Random Overcomplete Models
Yuchen Wu
Kangjie Zhou
173
6
0
07 Nov 2022
Misspecified Phase Retrieval with Generative Priors
Zhaoqiang Liu
Xinshao Wang
Jiulong Liu
86
6
0
11 Oct 2022
Robust Tests in Online Decision-Making
Gi-Soo Kim
Hyun-Joon Yang
J. P. Kim
OffRL
48
0
0
21 Aug 2022
Residual Bootstrap Exploration for Stochastic Linear Bandit
Shuang Wu
ChiHua Wang
Yuantong Li
Guang Cheng
81
8
0
23 Feb 2022
Double Thompson Sampling in Finite stochastic Games
Shuqing Shi
Xiaobin Wang
Zhi-Xuan Yang
Fan Zhang
Hong Qu
17
0
0
21 Feb 2022
Optimal Regret Is Achievable with Bounded Approximate Inference Error: An Enhanced Bayesian Upper Confidence Bound Framework
Ziyi Huang
Henry Lam
A. Meisami
Haofeng Zhang
109
4
0
31 Jan 2022
Bregman Deviations of Generic Exponential Families
Sayak Ray Chowdhury
Patrick Saux
Odalric-Ambrym Maillard
Aditya Gopalan
69
12
0
18 Jan 2022
Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
Mao Ye
Qiang Liu
47
1
0
17 Oct 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
137
28
0
08 Aug 2021
Debiasing Samples from Online Learning Using Bootstrap
Ningyuan Chen
Xuefeng Gao
Yi Xiong
OffRL, OnRL
52
4
0
31 Jul 2021
GuideBoot: Guided Bootstrap for Deep Contextual Bandits
Feiyang Pan
Haoming Li
Xiang Ao
Wei Wang
Yanrong Kang
Ao Tan
Qing He
38
0
0
18 Jul 2021
Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors
Zhaoqiang Liu
Subhro Ghosh
Jonathan Scarlett
61
18
0
29 Jun 2021
Fundamental Limits of Reinforcement Learning in Environment with Endogeneous and Exogeneous Uncertainty
Rongpeng Li
74
0
0
15 Jun 2021
Multi-armed Bandit Requiring Monotone Arm Sequences
Ningyuan Chen
133
11
0
07 Jun 2021
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Botao Hao
X. Ji
Yaqi Duan
Hao Lu
Csaba Szepesvári
Mengdi Wang
OffRL
87
40
0
06 Feb 2021
Sharper Sub-Weibull Concentrations
Huiming Zhang
Haoyu Wei
116
20
0
04 Feb 2021
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation
Ilya Kostrikov
Ofir Nachum
OffRL
54
31
0
27 Jul 2020
A Unifying Framework for Reinforcement Learning and Planning
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
131
9
0
26 Jun 2020
Differentiable Linear Bandit Algorithm
Kaige Yang
Laura Toni
52
6
0
04 Jun 2020
Residual Bootstrap Exploration for Bandit Algorithms
ChiHua Wang
Yang Yu
Botao Hao
Guang Cheng
51
16
0
19 Feb 2020
Sub-Weibull distributions: generalizing sub-Gaussian and sub-Exponential properties to heavier-tailed distributions
M. Vladimirova
Stéphane Girard
Hien Nguyen
Julyan Arbel
101
92
0
13 May 2019