2109.00591
Cited By
Fight Fire with Fire: Fine-tuning Hate Detectors using Large Samples of Generated Hate Speech
1 September 2021
Tomer Wullach
A. Adler
Einat Minkov
Papers citing
"Fight Fire with Fire: Fine-tuning Hate Detectors using Large Samples of Generated Hate Speech"
14 / 14 papers shown
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
Xinyue Shen
Yixin Wu
Y. Qu
Michael Backes
Savvas Zannettou
Yang Zhang
120
7
0
28 Jan 2025
A Target-Aware Analysis of Data Augmentation for Hate Speech Detection
Camilla Casula
Sara Tonelli
54
1
0
10 Oct 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Congrui Huang
148
7
0
23 Sep 2024
Shortchanged: Uncovering and Analyzing Intimate Partner Financial Abuse in Consumer Complaints
Arkaprabha Bhattacharya
Kevin Lee
Vineeth Ravi
Jessica Staddon
Rosanna Bellini
23
2
0
20 Mar 2024
Improving Cross-Domain Hate Speech Generalizability with Emotion Knowledge
Shi Yin Hong
Susan Gauch
62
2
0
24 Nov 2023
Generative AI for Hate Speech Detection: Evaluation and Findings
Sagi Pendzel
Tomer Wullach
Amir Adler
Einat Minkov
60
11
0
16 Nov 2023
Simple synthetic data reduces sycophancy in large language models
Jerry W. Wei
Da Huang
Yifeng Lu
Denny Zhou
Quoc V. Le
114
74
0
07 Aug 2023
Detecting Multidimensional Political Incivility on Social Media
Sagi Pendzel
Nir Lotan
Alon Zoizner
Einat Minkov
29
1
0
24 May 2023
Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection
Rabiul Awal
Roy Ka-wei Lee
Eshaan Tanwar
Tanmay Garg
Tanmoy Chakraborty
73
28
0
04 Mar 2023
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
270
99
0
06 Oct 2022
SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice
Mohit Singhal
Chen Ling
Pujan Paudel
Poojitha Thota
Nihal Kumarswamy
Gianluca Stringhini
Shirin Nilizadeh
156
33
0
29 Jun 2022
CRUSH: Contextually Regularized and User anchored Self-supervised Hate speech Detection
Souvic Chakraborty
Parag Dutta
Sumegh Roychowdhury
Animesh Mukherjee
31
8
0
13 Apr 2022
Going Extreme: Comparative Analysis of Hate Speech in Parler and Gab
Abraham Israeli
Oren Tsur
80
1
0
27 Jan 2022
Character-level HyperNetworks for Hate Speech Detection
Tomer Wullach
A. Adler
Einat Minkov
61
14
0
11 Nov 2021