2404.00495
Cited By
Configurable Safety Tuning of Language Models with Synthetic Preference Data
30 March 2024
Víctor Gallego
Papers citing "Configurable Safety Tuning of Language Models with Synthetic Preference Data"
1 / 1 papers shown
Title
Suppressing Pink Elephants with Direct Principle Feedback
Louis Castricato
Nathan Lile
Suraj Anand
Hailey Schoelkopf
Siddharth Verma
Stella Biderman
12 Feb 2024