arXiv:2407.10995
LionGuard: Building a Contextualized Moderation Classifier to Tackle Localized Unsafe Content
24 June 2024
Jessica Foo, Shaun Khoo
Papers citing "LionGuard: Building a Contextualized Moderation Classifier to Tackle Localized Unsafe Content"
3 / 3 papers shown
A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection
Gabriel Chua, Shing Yee Chan, Shaun Khoo
75
1
0
20 Nov 2024
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, Sam Bowman
212
367
0
15 Oct 2021