2310.10556
Cited By
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks
16 October 2023
Zihao Li
Xiang Ji
Minshuo Chen
Mengdi Wang
OffRL
Papers citing
"Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks"
6 / 6 papers shown
Title
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Ming Yin
Mengdi Wang
Yu-Xiang Wang
OffRL
43
11
0
03 Oct 2022
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trebacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
225
495
0
28 Sep 2022
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
235
255
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
The Intrinsic Dimension of Images and Its Impact on Learning
Phillip E. Pope
Chen Zhu
Ahmed Abdelkader
Micah Goldblum
Tom Goldstein
189
256
0
18 Apr 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
329
1,944
0
04 May 2020