Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control

10 November 2022

Papers citing "Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control"

6 / 6 papers shown

Title
Reward Gaming in Conditional Text Generation Richard Yuanzhe Pang Vishakh Padmakumar Thibault Sellam Ankur P. Parikh He He 21 24 0 16 Nov 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 303 11,881 0 04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 245 1,986 0 31 Dec 2020
Fine-Tuning Language Models from Human Preferences Daniel M. Ziegler Nisan Stiennon Jeff Wu Tom B. Brown Alec Radford Dario Amodei Paul Christiano G. Irving ALM 275 1,583 0 18 Sep 2019
The Woman Worked as a Babysitter: On Biases in Language Generation Emily Sheng Kai-Wei Chang Premkumar Natarajan Nanyun Peng 206 615 0 03 Sep 2019
Deep Reinforcement Learning for Dialogue Generation Jiwei Li Will Monroe Alan Ritter Michel Galley Jianfeng Gao Dan Jurafsky 198 1,325 0 05 Jun 2016