Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning

4 April 2024

Papers citing "Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning"


Title | Authors | Tag | Counts | Date
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment | C. Voelcker, Marcel Hussing, Eric Eaton | OffRL | 16, 3, 0 | 11 Oct 2024
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF | Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Hassan Awadallah, Alexander Rakhlin | | 36, 32, 0 | 31 May 2024
Oracle-Efficient Reinforcement Learning for Max Value Ensembles | Marcel Hussing, Michael Kearns, Aaron Roth, S. B. Sengupta, Jessica Sorrell | | 22, 0, 0 | 27 May 2024
On Learning Parities with Dependent Noise | Noah Golowich, Ankur Moitra, Dhruv Rohatgi | | 18, 1, 0 | 17 Apr 2024