arXiv: 2303.10430
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models
18 March 2023
Yiran Ye, Thai Le, Dongwon Lee
AAML
DeLMO
Papers citing
"NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models"
2 / 2 papers shown
Title
Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models
Jiang Zhang, Qiong Wu, Yiming Xu, Cheng Cao, Zheng Du, Konstantinos Psounis
28
14
0
13 Dec 2023
Generating Natural Language Adversarial Examples
M. Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani B. Srivastava, Kai-Wei Chang
AAML
245
914
0
21 Apr 2018