Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.14364
Cited By
CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants
27 April 2023
A. Sun
Varun Nair
Elliot Schumacher
Anitha Kannan
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants"
8 / 8 papers shown
Title
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
Makesh Narsimhan Sreedhar
Traian Rebedea
Shaona Ghosh
Jiaqi Zeng
Christopher Parisien
ALM
27
4
0
04 Apr 2024
Does fine-tuning GPT-3 with the OpenAI API leak personally-identifiable information?
A. Sun
Eliott Zemour
Arushi Saxena
Udith Vaidyanathan
Eric Lin
Christian Lau
Vaikkunth Mugunthan
SILM
27
18
0
31 Jul 2023
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trkebacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
225
495
0
28 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
242
191
0
15 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
What Makes Good In-Context Examples for GPT-
3
3
3
?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML
RALM
275
1,296
0
17 Jan 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019
1