Online Markov Decision Processes with Aggregate Bandit Feedback


31 January 2021
Alon Cohen
Haim Kaplan
Tomer Koren
Yishay Mansour
    OffRL

Papers citing "Online Markov Decision Processes with Aggregate Bandit Feedback"

8 / 8 papers shown
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
194
0
0
12 May 2025
Multi-turn Reinforcement Learning from Preference Human Feedback
Lior Shani
Aviv Rosenberg
Asaf B. Cassel
Oran Lang
Daniele Calandriello
...
Bilal Piot
Idan Szpektor
Avinatan Hassidim
Yossi Matias
Rémi Munos
97
34
0
23 May 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
64
4
0
13 May 2024
Online Resource Allocation in Episodic Markov Decision Processes
Duksang Lee
William Overman
Dabeen Lee
58
1
0
18 May 2023
Dynamic Regret of Online Markov Decision Processes
Peng Zhao
Longfei Li
Zhi Zhou
OffRL
95
17
0
26 Aug 2022
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Tengyu Xu
Yue Wang
Shaofeng Zou
Yingbin Liang
OffRL
70
13
0
13 Jun 2022
Reinforcement Learning with a Terminator
Guy Tennenholtz
Nadav Merlis
Lior Shani
Shie Mannor
Uri Shalit
Gal Chechik
Assaf Hallak
Gal Dalal
59
5
0
30 May 2022
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Niladri S. Chatterji
Aldo Pacchiano
Peter L. Bartlett
Michael I. Jordan
OffRL
79
26
0
29 May 2021