
Ruddit: Norms of Offensiveness for English Reddit Comments
Papers citing "Ruddit: Norms of Offensiveness for English Reddit Comments"
Title |
---|
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations. Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, ... Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa |