2412.13435
Cited By
Lightweight Safety Classification Using Pruned Language Models
18 December 2024
Mason Sawtell
Tula Masterman
Sandi Besen
Jim Brown
Papers citing "Lightweight Safety Classification Using Pruned Language Models"
1 / 1 papers shown
Title
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu, Hongcheng Gao, Shengfang Zhai, Jun-Xiong Xia, Tianyi Wu, Zhiwei Xue, Y. Chen, Kenji Kawaguchi, Jiaheng Zhang, Bryan Hooi
AI4TS
LRM
131
14
0
30 Jan 2025