Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.11456
Cited By
HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models
18 March 2024
H. Nghiem
Hal Daumé
Re-assign community
ArXiv
PDF
HTML
Papers citing
"HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models"
5 / 5 papers shown
Title
Content Moderation by LLM: From Accuracy to Legitimacy
Tao Huang
AILaw
32
3
0
05 Sep 2024
Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs
Shashank Gupta
Vaishnavi Shrivastava
A. Deshpande
A. Kalyan
Peter Clark
Ashish Sabharwal
Tushar Khot
114
49
0
08 Nov 2023
Probing LLMs for hate speech detection: strengths and vulnerabilities
Sarthak Roy
Ashish Harshavardhan
Animesh Mukherjee
Punyajoy Saha
63
31
0
19 Oct 2023
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Mai Elsherief
Caleb Ziems
D. Muchlinski
Vaishnavi Anupindi
Jordyn Seybolt
M. D. Choudhury
Diyi Yang
85
233
0
11 Sep 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
274
882
0
18 Apr 2021
1