1703.04009
Cited By
Automated Hate Speech Detection and the Problem of Offensive Language
11 March 2017
Thomas Davidson
Dana Warmsley
M. Macy
Ingmar Weber
Papers citing
"Automated Hate Speech Detection and the Problem of Offensive Language"
16 / 16 papers shown
Title
AmpleHate: Amplifying the Attention for Versatile Implicit Hate Detection
Yejin Lee
Joonghyuk Hahn
Hyeseon Ahn
Yo-Sub Han
32
0
0
26 May 2025
From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs
Muhammad Farid Adilazuarda
Chen Cecilia Liu
Iryna Gurevych
Alham Fikri Aji
80
0
0
22 May 2025
Evolving Hate Speech Online: An Adaptive Framework for Detection and Mitigation
Shiza Ali
Jeremy Blackburn
Gianluca Stringhini
73
0
0
24 Feb 2025
Echoes of Discord: Forecasting Hater Reactions to Counterspeech
Xiaoying Song
Sharon Lisseth Perez
Xinchen Yu
Eduardo Blanco
Lingzi Hong
353
0
0
17 Feb 2025
Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet
Berk Atil
Vipul Gupta
Sarkar Snigdha Sarathi Das
R. Passonneau
324
0
0
07 Feb 2025
TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection
Yang Cao
Sikun Yang
Chen Li
Haolong Xiang
Lianyong Qi
Bo Liu
Rongsheng Li
Ming Liu
71
0
0
21 Jan 2025
DefVerify: Do Hate Speech Models Reflect Their Dataset's Definition?
Urja Khurana
Eric T. Nalisnick
Antske Fokkens
81
1
0
21 Oct 2024
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
Xianyang Zhan
Agam Goyal
Yilun Chen
Eshwar Chandrasekharan
Koustuv Saha
AI4MH
340
3
0
17 Oct 2024
TaeBench: Improving Quality of Toxic Adversarial Examples
Xuan Zhu
Dmitriy Bespalov
Liwen You
Ninad Kulkarni
Yanjun Qi
AAML
76
0
0
08 Oct 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Congrui Huang
68
7
0
23 Sep 2024
Identity-related Speech Suppression in Generative AI Content Moderation
Oghenefejiro Isaacs Anigboro
Charlie M. Crawford
Danaë Metaxa
Sorelle A. Friedler
56
0
0
09 Sep 2024
A Causal Framework for Evaluating Deferring Systems
Filippo Palomba
Andrea Pugnana
Jose M. Alvarez
Salvatore Ruggieri
CML
69
4
0
29 May 2024
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets
Manuel Tonneau
Diyi Liu
Samuel Fraiberger
Ralph Schroeder
Scott A. Hale
Paul Röttger
44
6
0
27 Apr 2024
Specification Overfitting in Artificial Intelligence
Benjamin Roth
Pedro Henrique Luz de Araujo
Yuxi Xia
Saskia Kaltenbrunner
Christoph Korab
117
1
0
13 Mar 2024
LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection
Ahmad Nasir
Aadish Sharma
Kokil Jaidka
Saifuddin Ahmed
51
3
0
29 Oct 2023
AtteSTNet -- An attention and subword tokenization based approach for code-switched text hate speech detection
Geet Shingi
Vedangi Wagh
88
0
0
10 Dec 2021