Provably Efficient Q-Learning with Low Switching Cost

Neural Information Processing Systems (NeurIPS), 2019
30 May 2019
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang

Papers citing "Provably Efficient Q-Learning with Low Switching Cost"

50 / 77 papers shown
The Adaptivity Barrier in Batched Nonparametric Bandits: Sharp Characterization of the Price of Unknown Margin
Rong Jiang
Cong Ma
210
0
0
05 Nov 2025
Q-learning with Posterior Sampling
Priyank Agrawal
Shipra Agrawal
Azmat Azati
OffRL, GP
367
2
0
01 Jun 2025
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
Conference on Uncertainty in Artificial Intelligence (UAI), 2025
Runze Zhao
Yue Yu
Adams Yiyue Zhu
Chen Yang
Dongruo Zhou
291
1
0
20 May 2025
Human Machine Co-Adaptation Model and Its Convergence Analysis
Steven W. Su
Yaqi Li
Kairui Guo
Rob Duffield
276
0
0
10 Mar 2025
Near-Optimal Reinforcement Learning with Shuffle Differential Privacy
Shaojie Bai
Mohammad Sadegh Talebi
Chengcheng Zhao
Peng Cheng
Jiming Chen
OffRL
517
0
0
18 Nov 2024
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL, OnRL
309
6
0
06 Nov 2024
Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Safwan Labbi
D. Tiapkin
Lorenzo Mancini
Paul Mangold
Eric Moulines
FedML
322
5
0
30 Oct 2024
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
International Conference on Learning Representations (ICLR), 2024
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
447
9
0
10 Oct 2024
Upper and Lower Bounds for Distributionally Robust Off-Dynamics Reinforcement Learning
Zhishuai Liu
Weixin Wang
Pan Xu
412
13
0
30 Sep 2024
State-free Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Mingyu Chen
Aldo Pacchiano
Xuezhou Zhang
368
0
0
27 Sep 2024
Human-Machine Co-Adaptation for Robot-Assisted Rehabilitation via Dual-Agent Multiple Model Reinforcement Learning (DAMMRL)
Yang An
Yaqi Li
Hongwei Wang
Rob Duffield
Steven W. Su
AI4CE
217
2
0
31 Jul 2024
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
Tao Ma
Xuzhi Yang
Zoltan Szabo
OffRL
417
1
0
01 Jul 2024
Test-Time Regret Minimization in Meta Reinforcement Learning
Mirco Mutti
Aviv Tamar
335
4
0
04 Jun 2024
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
Jianliang He
Han Zhong
Zhuoran Yang
355
6
0
19 Apr 2024
Batched Nonparametric Contextual Bandits
Rong Jiang
Cong Ma
OffRL
541
4
0
27 Feb 2024
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo
Laixi Shi
Gauri Joshi
Yuejie Chi
OffRL
301
9
0
08 Feb 2024
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Dan Qiao
Yu Wang
OffRL
332
5
0
02 Feb 2024
Constant Stepsize Q-learning: Distributional Convergence, Bias and Extrapolation
Yixuan Zhang
Qiaomin Xie
349
14
0
25 Jan 2024
Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
Meshal Alharbi
Mardavij Roozbehani
M. Dahleh
337
4
0
19 Dec 2023
Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
International Conference on Learning Representations (ICLR), 2023
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
OffRL
382
2
0
02 Oct 2023
Minimax Optimal Q Learning with Nearest Neighbors
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Puning Zhao
Lifeng Lai
OffRL
324
16
0
03 Aug 2023
Settling the Sample Complexity of Online Reinforcement Learning
Annual Conference Computational Learning Theory (COLT), 2023
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
884
42
0
25 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Neural Information Processing Systems (NeurIPS), 2023
Ruiqi Zhang
Andrea Zanette
OffRL, OnRL
343
11
0
10 Jul 2023
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
International Conference on Machine Learning (ICML), 2023
Yunfan Li
Yiran Wang
Y. Cheng
Lin F. Yang
OffRL
270
6
0
15 Jun 2023
Near-Optimal Partially Observable Reinforcement Learning with Partial Online State Information
Ming Shi
Yingbin Liang
Ness B. Shroff
411
3
0
14 Jun 2023
The Curious Price of Distributional Robustness in Reinforcement Learning with a Generative Model
Neural Information Processing Systems (NeurIPS), 2023
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Matthieu Geist
Yuejie Chi
OOD
501
54
0
26 May 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Neural Information Processing Systems (NeurIPS), 2023
Xiang Ji
Gen Li
OffRL
434
9
0
24 May 2023
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
International Conference on Machine Learning (ICML), 2023
Jiin Woo
Gauri Joshi
Yuejie Chi
FedML
402
34
0
18 May 2023
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Gen Li
Wenhao Zhan
Jason D. Lee
Yuejie Chi
Yuxin Chen
OffRL, OnRL
339
18
0
17 May 2023
Human Machine Co-adaption Interface via Cooperation Markov Decision Process System
Kairui Guo
Adrian Cheng
Yaqing Li
Jun Li
Rob Duffield
Steven W. Su
156
0
0
03 May 2023
Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning
Annual Conference Computational Learning Theory (COLT), 2023
Gen Li
Yuling Yan
Yuxin Chen
Jianqing Fan
OffRL
372
16
0
14 Apr 2023
Logarithmic Switching Cost in Reinforcement Learning beyond Linear MDPs
Dan Qiao
Ming Yin
Yu Wang
227
8
0
24 Feb 2023
Near-Optimal Adversarial Reinforcement Learning with Switching Costs
International Conference on Learning Representations (ICLR), 2023
Ming Shi
Yitao Liang
Ness B. Shroff
180
4
0
08 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
International Conference on Machine Learning (ICML), 2023
Runlong Zhou
Zihan Zhang
S. Du
406
19
0
31 Jan 2023
Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
AAAI Conference on Artificial Intelligence (AAAI), 2023
Nikolai Karpov
Qin Zhang
423
2
0
26 Jan 2023
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations
International Conference on Learning Representations (ICLR), 2022
Jiachen Hu
Han Zhong
Chi Jin
Liwei Wang
364
10
0
27 Oct 2022
Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity
Neural Information Processing Systems (NeurIPS), 2022
Abhishek Gupta
Aldo Pacchiano
Yuexiang Zhai
Sham Kakade
Sergey Levine
OffRL
267
100
0
18 Oct 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
260
14
0
15 Oct 2022
Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation
International Conference on Learning Representations (ICLR), 2022
Dan Qiao
Yu Wang
OffRL
345
15
0
03 Oct 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yiding Chen
Xuezhou Zhang
Jianchao Tan
Mengdi Wang
Xiaojin Zhu
OffRL
404
22
0
01 Jun 2022
One Policy is Enough: Parallel Exploration with a Single Policy is Near-Optimal for Reward-Free Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Pedro Cisneros-Velarde
Boxiang Lyu
Oluwasanmi Koyejo
Mladen Kolar
OffRL
468
3
0
31 May 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2022
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
393
45
0
14 Mar 2022
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets
Neural Information Processing Systems (NeurIPS), 2022
Yifei Min
Tianhao Wang
Ruitu Xu
Zhaoran Wang
Sai Li
Zhuoran Yang
272
30
0
07 Mar 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
International Conference on Machine Learning (ICML), 2022
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
382
107
0
28 Feb 2022
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
International Conference on Learning Representations (ICLR), 2022
Jiawei Huang
Jinglin Chen
Li Zhao
Tao Qin
Nan Jiang
Tie-Yan Liu
OffRL
355
31
0
14 Feb 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
International Conference on Machine Learning (ICML), 2022
Dan Qiao
Ming Yin
Ming Min
Yu Wang
302
35
0
13 Feb 2022
Improved Regret for Differentially Private Exploration in Linear MDP
International Conference on Machine Learning (ICML), 2022
Dung Daniel Ngo
G. Vietri
Zhiwei Steven Wu
306
8
0
02 Feb 2022
A Benchmark for Low-Switching-Cost Reinforcement Learning
Shusheng Xu
Yancheng Liang
Yunfei Li
S. Du
Yi Wu
OffRL
185
0
0
13 Dec 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
344
88
0
17 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
402
66
0
09 Oct 2021