Breadcrumbs to the Goal: Goal-Conditioned Exploration from
Human-in-the-Loop Feedback

Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

20 July 2023

Tao Chen

Abhishek Gupta

Papers citing "Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback"

6 / 6 papers shown

Title
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization Guanlin Liu Kaixuan Ji Ning Dai Zheng Wu Chen Dun Q. Gu Lin Yan Quanquan Gu Lin Yan OffRL LRM 48 9 0 11 Oct 2024
Random Latent Exploration for Deep Reinforcement Learning Srinath Mahankali Zhang-Wei Hong Ayush Sekhari Alexander Rakhlin Pulkit Agrawal 33 3 0 18 Jul 2024
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models Cong Lu Shengran Hu Jeff Clune LLMAG 44 10 0 24 May 2024
Crowd-PrefRL: Preference-Based Reward Learning from Crowds David Chhan Ellen R. Novoseller Vernon J. Lawhern 29 5 0 17 Jan 2024
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 313 11,953 0 04 Mar 2022
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems Sergey Levine Aviral Kumar George Tucker Justin Fu OffRL GP 340 1,955 0 04 May 2020