Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy

23 October 2020
Masahiro Kato
Yusuke Kaneko
    OffRL

Papers citing "Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy"

2 / 2 papers shown
Adaptive Doubly Robust Estimator from Non-stationary Logging Policy under a Convergence of Average Probability
Masahiro Kato
OffRL
182
0
0
17 Feb 2021
Policy design in experiments with unknown interference
Davide Viviano
Jess Rudder
415
10
0
16 Nov 2020