ResearchTrend.AI
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

18 December 2020
Binny Mathew
Punyajoy Saha
Seid Muhie Yimam
Chris Biemann
Pawan Goyal
Animesh Mukherjee
arXiv: 2012.10289

Papers citing "HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection"

50 / 280 papers shown
FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs
S. Kadhe
Anisa Halimi
Ambrish Rawat
Nathalie Baracaldo
MU
22
7
0
12 Dec 2023
Toxic language detection: a systematic review of Arabic datasets
Imene Bensalem
Paolo Rosso
Hanane Zitouni
32
4
0
12 Dec 2023
A Text-to-Text Model for Multilingual Offensive Language Identification
Tharindu Ranasinghe
Marcos Zampieri
27
3
0
06 Dec 2023
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation
Randall Balestriero
Romain Cosentino
Sarath Shekkizhar
28
2
0
04 Dec 2023
Improving Cross-Domain Hate Speech Generalizability with Emotion Knowledge
Shi Yin Hong
Susan Gauch
40
2
0
24 Nov 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study
Maike Zufle
Verna Dankers
Ivan Titov
47
0
0
16 Nov 2023
Generative AI for Hate Speech Detection: Evaluation and Findings
Sagi Pendzel
Tomer Wullach
Amir Adler
Einat Minkov
33
11
0
16 Nov 2023
Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection
Sarah Masud
Mohammad Aflah Khan
Md. Shad Akhtar
Tanmoy Chakraborty
32
3
0
16 Nov 2023
The Uli Dataset: An Exercise in Experience Led Annotation of oGBV
Arnav Arora
Maha Jinadoss
Cheshta Arora
Denny George
Brindaalakshmi
...
Ambika Tandon
Rishav Thakker
Rahul Dev Korra
Aatman Vaidya
Tarunima Prabhakar
26
1
0
15 Nov 2023
Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models
Carlos Alejandro Aguirre
Kuleen Sasse
Isabel Cachola
Mark Dredze
32
1
0
14 Nov 2023
Detecting and Correcting Hate Speech in Multimodal Memes with Large Visual Language Model
Minh-Hao Van
Xintao Wu
VLM
MLLM
37
10
0
12 Nov 2023
GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives
Vinodkumar Prabhakaran
Christopher Homan
Lora Aroyo
Aida Mostafazadeh Davani
Alicia Parrish
Alex S. Taylor
Mark Díaz
Ding Wang
Greg Serapio-García
47
9
0
09 Nov 2023
Factoring Hate Speech: A New Annotation Framework to Study Hate Speech in Social Media
Gal Ron
Effi Levi
Odelia Oshri
Shaul R. Shenhav
25
2
0
07 Nov 2023
Explainable Identification of Hate Speech towards Islam using Graph Neural Networks
Azmine Toushik Wasi
33
0
0
02 Nov 2023
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning
Yongjin Yang
Joonkee Kim
Yujin Kim
Namgyu Ho
James Thorne
Se-Young Yun
27
21
0
01 Nov 2023
Text-Transport: Toward Learning Causal Effects of Natural Language
Victoria Lin
Louis-Philippe Morency
Eli Ben-Michael
6
4
0
31 Oct 2023
On the Interplay between Fairness and Explainability
Stephanie Brandl
Emanuele Bugliarello
Ilias Chalkidis
FaML
27
4
0
25 Oct 2023
K-HATERS: A Hate Speech Detection Corpus in Korean with Target-Specific Ratings
Chaewon Park
Soohwan Kim
Kyubyong Park
Kunwoo Park
35
4
0
24 Oct 2023
SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research
Dimosthenis Antypas
Asahi Ushio
Francesco Barbieri
Leonardo Neves
Kiamehr Rezaee
Luis Espinosa-Anke
Jiaxin Pei
Jose Camacho-Collados
35
9
0
23 Oct 2023
Probing LLMs for hate speech detection: strengths and vulnerabilities
Sarthak Roy
Ashish Harshavardhan
Animesh Mukherjee
Punyajoy Saha
63
33
0
19 Oct 2023
Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale
Qichao Wang
Tian Bian
Yian Yin
Tingyang Xu
Hong Cheng
Helen M. Meng
Zibin Zheng
Liang Chen
Bingzhe Wu
VLM
DiffM
36
3
0
18 Oct 2023
VIBE: Topic-Driven Temporal Adaptation for Twitter Classification
Yuji Zhang
Jing Li
Wenjie Li
VLM
32
11
0
16 Oct 2023
InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations
Nils Feldhus
Qianli Wang
Tatiana Anikina
Sahil Chopra
Cennet Oguz
Sebastian Möller
42
11
0
09 Oct 2023
Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation
Aman Khullar
Daniel K. Nkemelu
Cuong V. Nguyen
Michael L. Best
45
2
0
04 Oct 2023
It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation
Wen Wu
Wenlin Chen
Chuxu Zhang
P. Woodland
21
1
0
30 Sep 2023
Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Speech Detection
Sarah Masud
Ashutosh Bajpai
Tanmoy Chakraborty
13
0
0
21 Sep 2023
Zero-Shot Robustification of Zero-Shot Models
Dyah Adila
Changho Shin
Lin Cai
Frederic Sala
51
19
0
08 Sep 2023
On the Challenges of Building Datasets for Hate Speech Detection
Vitthal Bhandari
20
1
0
06 Sep 2023
Explainability for Large Language Models: A Survey
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Mengnan Du
LRM
36
415
0
02 Sep 2023
Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis
Nayeon Lee
Chani Jung
Jun-Hee Myung
Jiho Jin
Jose Camacho-Collados
Juho Kim
Alice Oh
52
14
0
31 Aug 2023
CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias
Vipul Gupta
Pranav Narayanan Venkit
Hugo Laurençon
Shomir Wilson
R. Passonneau
48
12
0
24 Aug 2023
A Survey on Fairness in Large Language Models
Yingji Li
Mengnan Du
Rui Song
Xin Wang
Ying Wang
ALM
57
60
0
20 Aug 2023
An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software
Wenxuan Wang
Jingyuan Huang
Jen-tse Huang
Chang Chen
Jiazhen Gu
Pinjia He
Michael R. Lyu
VLM
36
6
0
18 Aug 2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang
Qiguang Chen
Longxuan Ma
Mingda Li
Yi Han
Yushan Qian
Haopeng Bai
Zixian Feng
Weinan Zhang
Ting Liu
ELM
31
9
0
15 Aug 2023
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content
Xinlei He
Savvas Zannettou
Yun Shen
Yang Zhang
CLL
29
37
0
10 Aug 2023
Causality Guided Disentanglement for Cross-Platform Hate Speech Detection
Paras Sheth
Tharindu Kumarage
Raha Moraffah
Amanat Chadha
Huan Liu
34
8
0
03 Aug 2023
HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution
Ehsan Kamalloo
A. Jafari
Xinyu Crystina Zhang
Nandan Thakur
Jimmy J. Lin
32
42
0
31 Jul 2023
On the Learning Dynamics of Attention Networks
Rahul Vashisht
H. G. Ramaswamy
13
1
0
25 Jul 2023
HateModerate: Testing Hate Speech Detectors against Content Moderation Policies
Jiangrui Zheng
Xueqing Liu
Guanqun Yang
Mirazul Haque
Xing Qian
Ravishka Rathnasuriya
Wei Yang
G. Budhrani
52
3
0
23 Jul 2023
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media
Liam Hebert
Gaurav Sahu
Yuxuan Guo
Nanda Kishore Sreenivas
Lukasz Golab
Robin Cohen
23
10
0
18 Jul 2023
Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation
Dimosthenis Antypas
Jose Camacho-Collados
53
23
0
04 Jul 2023
DICES Dataset: Diversity in Conversational AI Evaluation for Safety
Lora Aroyo
Alex S. Taylor
Mark Díaz
Christopher Homan
Alicia Parrish
Greg Serapio-García
Vinodkumar Prabhakaran
Ding Wang
34
33
0
20 Jun 2023
Cross-Domain Toxic Spans Detection
Stefan F. Schouten
Baran Barbarestani
Wondimagegnhue Tufa
Piek Vossen
I. Markov
18
2
0
16 Jun 2023
PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework
Paras Sheth
Tharindu Kumarage
Raha Moraffah
Amanat Chadha
Huan Liu
36
7
0
15 Jun 2023
Strategies to exploit XAI to improve classification systems
Andrea Apicella
Luca Di Lorenzo
Francesco Isgrò
A. Pollastro
R. Prevete
11
9
0
09 Jun 2023
DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
Ali Modarressi
Mohsen Fayyaz
Ehsan Aghazadeh
Yadollah Yaghoobzadeh
Mohammad Taher Pilehvar
38
26
0
05 Jun 2023
Being Right for Whose Right Reasons?
Terne Sasha Thorn Jakobsen
Laura Cabello
Anders Søgaard
39
10
0
01 Jun 2023
Exploiting Explainability to Design Adversarial Attacks and Evaluate Attack Resilience in Hate-Speech Detection Models
Pranath Reddy Kumbam
Sohaib Uddin Syed
Prashanth Thamminedi
S. Harish
Ian Perera
Bonnie J. Dorr
AAML
29
1
0
29 May 2023
Evaluating GPT-3 Generated Explanations for Hateful Content Moderation
H. Wang
Ming Shan Hee
Rabiul Awal
K. T. W. Choo
Roy Ka-wei Lee
24
42
0
28 May 2023
Detecting Multidimensional Political Incivility on Social Media
Sagi Pendzel
Nir Lotan
Alon Zoizner
Einat Minkov
19
1
0
24 May 2023