2407.02987
LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models
3 July 2024
Hayder Elesedy
Pedro M. Esperança
Silviu Vlad Oprea
Mete Ozay
KELM
Papers citing
"LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models"
6 / 6 papers shown
Title
No Free Lunch with Guardrails
Divyanshu Kumar
Nitin Aravind Birur
Tanay Baswa
Sahil Agarwal
P. Harshangi
54
1
0
01 Apr 2025
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
84
1
0
09 Oct 2024
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Zhichen Dong
Zhanhui Zhou
Chao Yang
Jing Shao
Yu Qiao
ELM
52
55
0
14 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
117
301
0
19 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,848
0
18 Apr 2021