2106.14866
Cited By
Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
28 June 2021
Wenshuo Guo, Kumar Krishna Agrawal, Aditya Grover, Vidya Muthukumar, A. Pananjady
Papers citing "Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits"
7 / 7 papers shown
Title
Eliciting Risk Aversion with Inverse Reinforcement Learning via Interactive Questioning
Ziteng Cheng, Anthony Coache, S. Jaimungal
18 | 0 | 0 | 16 Aug 2023

Diffusion Models for Black-Box Optimization
S. Krishnamoorthy, Satvik Mashkaria, Aditya Grover
DiffM
21 | 50 | 0 | 12 Jun 2023

MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning
Arundhati Banerjee, Soham R. Phade, Stefano Ermon, Stephan Zheng
OffRL
17 | 1 | 0 | 10 Apr 2023

Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning
Tung Nguyen, Qinqing Zheng, Aditya Grover
OffRL
19 | 6 | 0 | 11 Oct 2022

Generative Pretraining for Black-Box Optimization
S. Krishnamoorthy, Satvik Mashkaria, Aditya Grover
OffRL, AI4CE
35 | 26 | 0 | 22 Jun 2022

Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection
Peter Henderson, Ben Chugg, Brandon R. Anderson, Kristen M. Altenburger, Alex Turk, J. Guyton, Jacob Goldin, Daniel E. Ho
OffRL
16 | 9 | 0 | 25 Apr 2022

Inverse Contextual Bandits: Learning How Behavior Evolves over Time
Alihan Huyuk, Daniel Jarrett, M. Schaar
CML, OffRL
22 | 11 | 0 | 13 Jul 2021