Doubly robust Thompson sampling for linear payoffs

Neural Information Processing Systems (NeurIPS), 2021
1 February 2021
Wonyoung Hedge Kim
Gi-Soo Kim
M. Paik

Papers citing "Doubly robust Thompson sampling for linear payoffs"

16 / 16 papers shown
Experimental Design for Semiparametric Bandits
Annual Conference Computational Learning Theory (COLT), 2025
Seok-Jin Kim
Gi-Soo Kim
Min-hwan Oh
245
1
0
16 Jun 2025
Linear Bandits with Partially Observable Features
Wonyoung Hedge Kim
Sungwoo Park
G. Iyengar
A. Zeevi
Min Hwan Oh
702
3
0
10 Feb 2025
A Quadrature Approach for General-Purpose Batch Bayesian Optimization via Probabilistic Lifting
Masaki Adachi
Satoshi Hayakawa
Martin Jørgensen
Saad Hamid
Harald Oberhauser
Michael A. Osborne
GP
443
3
0
18 Apr 2024
Adaptive Experimental Design for Policy Learning
Masahiro Kato
Kyohei Okumura
Takuya Ishihara
Toru Kitagawa
OffRL
494
1
0
08 Jan 2024
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions
Easton K. Huch
Jieru Shi
Madeline R Abbott
J. Golbus
Alexander Moreno
Walter Dempsey
OffRL
476
0
0
11 Dec 2023
A Doubly Robust Approach to Sparse Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Wonyoung Hedge Kim
Garud Iyengar
A. Zeevi
245
5
0
23 Oct 2023
Learning the Pareto Front Using Bootstrapped Observation Samples
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Wonyoung Hedge Kim
G. Iyengar
A. Zeevi
378
7
0
31 May 2023
Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds
Masahiro Kato
Masaaki Imaizumi
Takuya Ishihara
T. Kitagawa
390
2
0
06 Feb 2023
Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback
International Conference on Machine Learning (ICML), 2023
Wonyoung Hedge Kim
G. Iyengar
A. Zeevi
205
4
0
31 Jan 2023
Best Arm Identification with Contextual Information under a Small Gap
Masahiro Kato
Masaaki Imaizumi
Takuya Ishihara
T. Kitagawa
477
3
0
15 Sep 2022
Risk-aware linear bandits with convex loss
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Patrick Saux
Odalric-Ambrym Maillard
274
3
0
15 Sep 2022
Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits
AAAI Conference on Artificial Intelligence (AAAI), 2022
Wonyoung Hedge Kim
Kyungbok Lee
M. Paik
345
18
0
15 Sep 2022
Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Wonyoung Hedge Kim
M. Paik
Min-hwan Oh
357
6
0
11 Jun 2022
Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits
Neural Information Processing Systems (NeurIPS), 2022
Tianyuan Jin
Pan Xu
X. Xiao
Anima Anandkumar
254
16
0
07 Jun 2022
Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap
Masahiro Kato
Kaito Ariu
Masaaki Imaizumi
Masahiro Nomura
Chao Qin
866
3
0
12 Jan 2022
Bandit Algorithms for Precision Medicine
Yangyi Lu
Ziping Xu
Ambuj Tewari
296
17
0
10 Aug 2021