Configurable Safety Tuning of Language Models with Synthetic Preference Data

30 March 2024
Víctor Gallego

Papers citing "Configurable Safety Tuning of Language Models with Synthetic Preference Data"

1 / 1 papers shown
Suppressing Pink Elephants with Direct Principle Feedback
Louis Castricato
Nathan Lile
Suraj Anand
Hailey Schoelkopf
Siddharth Verma
Stella Biderman
12 Feb 2024