2410.00775
Decoding Hate: Exploring Language Models' Reactions to Hate Speech
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
1 October 2024
Paloma Piot
Javier Parapar
Papers citing "Decoding Hate: Exploring Language Models' Reactions to Hate Speech"
5 / 5 papers shown
Evaluating Large Language Models for Detecting Antisemitism
Jay Patel
Hrudayangam Mehta
Jeremy Blackburn
139
0
0
22 Sep 2025
WATCHED: A Web AI Agent Tool for Combating Hate Speech by Expanding Data
Paloma Piot
Diego Sánchez
Javier Parapar
84
0
0
01 Sep 2025
Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate
Mikel K. Ngueajio
Flor Miriam Plaza del Arco
Yi-Ling Chung
D. Rawat
Amanda Cercas Curry
139
1
0
04 Jun 2025
Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models
Paloma Piot
Patricia Martín-Rodilla
Javier Parapar
182
0
0
04 May 2025
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM
KELM
304
58
0
08 Apr 2024