arXiv: 2307.12418
HateModerate: Testing Hate Speech Detectors against Content Moderation Policies
23 July 2023
Jiangrui Zheng, Xueqing Liu, Guanqun Yang, Mirazul Haque, Xing Qian, Ravishka Rathnasuriya, Wei Yang, G. Budhrani
Papers citing "HateModerate: Testing Hate Speech Detectors against Content Moderation Policies"
4 / 4 papers shown
On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs
Herun Wan, Minnan Luo, Zhixiong Su, Guang Dai, Xiang Zhao
DeLMO
17
0
0
16 Oct 2024
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
55
1
0
09 Oct 2024
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Mai Elsherief, Caleb Ziems, D. Muchlinski, Vaishnavi Anupindi, Jordyn Seybolt, M. D. Choudhury, Diyi Yang
85
233
0
11 Sep 2021
Hypothesis Only Baselines in Natural Language Inference
Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme
187
574
0
02 May 2018