Papers citing 'A Benchmark Dataset for Learning to Intervene in Online Hate Speech'

Title
Beating Harmful Stereotypes Through Facts: RAG-based Counter-speech Generation Greta Damo Elena Cabrio S. Villata 60 0 0 14 Oct 2025
Bridging Gaps in Hate Speech Detection: Meta-Collections and Benchmarks for Low-Resource Iberian Languages Paloma Piot José Ramom Pichel Campos Javier Parapar 76 1 0 13 Oct 2025
Defining, Understanding, and Detecting Online Toxicity: Challenges and Machine Learning Approaches Gautam Kishore Shahi Tim A. Majchrzak 84 0 0 14 Sep 2025
WATCHED: A Web AI Agent Tool for Combating Hate Speech by Expanding Data Paloma Piot Diego Sánchez Javier Parapar 84 0 0 01 Sep 2025
Counterspeech for Mitigating the Influence of Media Bias: Comparing Human and LLM-Generated Responses Luyang Lin Zijin Feng Lingzhi Wang Kam-Fai Wong 66 1 0 20 Aug 2025
Cyberbullying Detection via Aggression-Enhanced Prompting Aisha Saeid Anu Sabu Girish A. Koushik Ferrante Neri Diptesh Kanojia 96 0 0 08 Aug 2025
Can NLP Tackle Hate Speech in the Real World? Stakeholder-Informed Feedback and Survey on Counterspeech Tanvi Dinkar Aiqi Jiang Simona Frenda Poppy Gerrard-Abbott Nancie Gunson Gavin Abercrombie Ioannis Konstas 70 0 0 06 Aug 2025
Dialogues of Dissent: Thematic and Rhetorical Dimensions of Hate and Counter-Hate Speech in Social Media Conversations Effi Levi Gal Ron Odelia Oshri Shaul R. Shenhav 46 0 0 28 Jul 2025
Web(er) of Hate: A Survey on How Hate Speech Is Typed Luna Wang Andrew Caines Alice Hutchings 86 0 0 19 Jun 2025
Effectiveness of Counter-Speech against Abusive Content: A Multidimensional Annotation and Classification Study Greta Damo Elena Cabrio S. Villata 74 2 0 13 Jun 2025
Counterspeech the ultimate shield! Multi-Conditioned Counterspeech Generation through Attributed Prefix Learning. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Aswini Kumar Padhi Anil Bandhakavi Tanmoy Chakraborty 386 0 0 17 May 2025
Debunking with Dialogue? Exploring AI-Generated Counterspeech to Challenge Conspiracy Theories Mareike Lisker Christina Gottschalk Helena Mihaljević 160 2 0 23 Apr 2025
Graphically Speaking: Unmasking Abuse in Social Media with Conversation Insights. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Célia Nouri Jean-Philippe Cointet Chloé Clavel 171 1 0 02 Apr 2025
Echoes of Discord: Forecasting Hater Reactions to Counterspeech. North American Chapter of the Association for Computational Linguistics (NAACL), 2025 Xiaoying Song Sharon Lisseth Perez Xinchen Yu Eduardo Blanco Lingzi Hong 825 5 0 17 Feb 2025
ReZG: Retrieval-Augmented Zero-Shot Counter Narrative Generation for Hate Speech Shuyu Jiang Wenyi Tang Xingshu Chen Rui Tang Haizhou Wang Wenxian Wang 90 7 0 31 Dec 2024
A Survey on Automatic Online Hate Speech Detection in Low-Resource Languages Susmita Das Arpita Dutta Kingshuk Roy Abir Mondal Arnab Mukhopadhyay 241 0 0 28 Nov 2024
Assessing the Human Likeness of AI-Generated Counterspeech. International Conference on Computational Linguistics (COLING), 2024 Xiaoying Song Sujana Mamidisetty Eduardo Blanco Lingzi Hong 190 4 0 14 Oct 2024
Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 Helena Bonaldi Greta Damo Nicolás Benjamín Ocampo Elena Cabrio S. Villata Marco Guerini 118 11 0 04 Oct 2024
CrowdCounter: A benchmark type-specific multi-target counterspeech dataset. Conference on Computational Natural Language Learning (CoNLL), 2024 Punyajoy Saha Abhilash Datta Abhik Jana Animesh Mukherjee 207 4 0 02 Oct 2024
Decoding Hate: Exploring Language Models' Reactions to Hate Speech. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Paloma Piot Javier Parapar 207 4 0 01 Oct 2024
What is the social benefit of hate speech detection research? A Systematic Review Sidney Gig-Jan Wong 107 1 0 26 Sep 2024
Analysis of Socially Unacceptable Discourse with Zero-shot Learning Rayane Ghilene Dimitra Niaouri Michele Linardi Julien Longhi 122 1 0 10 Sep 2024
Towards Generalized Offensive Language Identification A. Dmonte Tejas Arya Tharindu Ranasinghe Marcos Zampieri 174 5 0 26 Jul 2024
Exploring the Plausibility of Hate and Counter Speech Detectors with Explainable AI Adrian Jaques Böck D. Slijepcevic Matthias Zeppelzauer 172 1 0 25 Jul 2024
Computational Politeness in Natural Language Processing: A Survey Priyanshu Priya Mauajama Firdaus Asif Ekbal 173 17 0 28 Jun 2024
COT: A Generative Approach for Hate Speech Counter-Narratives via Contrastive Optimal Transport Linhao Zhang Li Jin Guangluan Xu Xiaoyu Li Xian Sun 161 3 0 18 Jun 2024
MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Prince Jha Raghav Jain Konika Mandal Vasu Sharma Sriparna Saha P. Bhattacharyya 175 17 0 08 Jun 2024
Improving code-mixed hate detection by native sample mixing: A case study for Hindi-English code-mixed scenario Debajyoti Mazumder Aakash Kumar Jasabanta Patro 139 1 0 31 May 2024
The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content Xinyu Wang S. Koneru Pranav Narayanan Venkit Brett Frischmann Sarah Rajtmajer 278 2 0 17 May 2024
The Unseen Targets of Hate -- A Systematic Review of Hateful Communication Datasets. Social Science Computer Review (SSCR), 2024 Zehui Yu Indira Sen Dennis Assenmacher Mattia Samory Leon Fröhling Christina Dahn Debora Nozza Claudia Wagner 210 10 0 14 May 2024
LGDE: Local Graph-based Dictionary Expansion Dominik J. Schindler Sneha Jha Xixuan Zhang Kilian Buehling Annett Heft Mauricio Barahona 238 0 0 13 May 2024
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets Manuel Tonneau Diyi Liu Samuel Fraiberger Ralph Schroeder Scott A. Hale Paul Röttger 280 16 0 27 Apr 2024
Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes I. Nejadgholi Kathleen C. Fraser Anna Kerkhof S. Kiritchenko 134 5 0 18 Apr 2024
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Kristina Gligorić Myra Cheng Lucia Zheng Esin Durmus Dan Jurafsky 167 13 0 02 Apr 2024
NLP for Counterspeech against Hate: A Survey and How-To Guide Helena Bonaldi Yi-Ling Chung Gavin Abercrombie Marco Guerini 195 26 0 29 Mar 2024
Outcome-Constrained Large Language Models for Countering Hate Speech Lingzi Hong Pengcheng Luo Eduardo Blanco Xiaoying Song 246 14 0 25 Mar 2024
Behind the Counter: Exploring the Motivations and Barriers of Online Counterspeech Writing Kaike Ping Anisha Kumar Xiaohan Ding Eugenia H Rho 147 4 0 25 Mar 2024
On Zero-Shot Counterspeech Generation by LLMs. International Conference on Language Resources and Evaluation (LREC), 2024 Punyajoy Saha Aalok Agrawal Abhik Jana Chris Biemann Animesh Mukherjee 189 25 0 22 Mar 2024
HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 H. Nghiem Hal Daumé 325 4 0 18 Mar 2024
Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation. International Conference on Language Resources and Evaluation (LREC), 2024 Jaione Bengoetxea Yi-Ling Chung Marco Guerini Rodrigo Agerri 215 13 0 14 Mar 2024
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future Minzhi Li Weiyan Shi Caleb Ziems Diyi Yang 206 11 0 28 Feb 2024
Optimizing Language Models for Human Preferences is a Causal Inference Problem Victoria Lin Eli Ben-Michael Louis-Philippe Morency 181 7 0 22 Feb 2024
Modularized Networks for Few-shot Hateful Meme Detection Rui Cao Roy Ka-wei Lee Jing Jiang 164 16 0 19 Feb 2024
A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models Jaylen Jones Lingbo Mo Eric Fosler-Lussier Huan Sun 225 5 0 18 Feb 2024
A Dataset for the Detection of Dehumanizing Language Paul Engelmann Peter Brunsgaard Trolle Christian Hardmeier 103 4 0 13 Feb 2024
Low-Resource Counterspeech Generation for Indic Languages: The Case of Bengali and Hindi. Findings (Findings), 2024 Mithun Das Saurabh Kumar Pandey Shivansh Sethi Punyajoy Saha Animesh Mukherjee 123 4 0 11 Feb 2024
MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection. International Conference on Web and Social Media (ICWSM), 2024 Paloma Piot-Perez-Abadin Patricia Martín-Rodilla Javier Parapar 124 11 0 12 Jan 2024
Explain To Decide: A Human-Centric Review on the Role of Explainable Artificial Intelligence in AI-assisted Decision Making Milad Rogha 146 1 0 11 Dec 2023
KhabarChin: Automatic Detection of Important News in the Persian Language Hamed Hematian Hemati Arash Lagzian M. S. Sartakhti Hamid Beigy Ehsaneddin Asgari 138 2 0 06 Dec 2023
DisCGen: A Framework for Discourse-Informed Counterspeech Generation. International Joint Conference on Natural Language Processing (IJCNLP), 2023 Sabit Hassan Malihe Alikhani 188 18 0 29 Nov 2023