Handling Bias in Toxic Speech Detection: A Survey

26 January 2022
Tanmay Garg
Sarah Masud
Tharun Suresh
Tanmoy Chakraborty

Papers citing "Handling Bias in Toxic Speech Detection: A Survey"

46 / 46 papers shown
Tackling Social Bias against the Poor: A Dataset and Taxonomy on Aporophobia
Georgina Curto
S. Kiritchenko
Muhammad Hammad Fahim Siddiqui
I. Nejadgholi
Kathleen C. Fraser
26
0
0
17 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
29
0
0
14 Apr 2025
Redefining Toxicity: An Objective and Context-Aware Approach for Stress-Level-Based Detection
Sergey Berezin
R. Farahbakhsh
Noel Crespi
53
0
0
20 Mar 2025
Lost in Moderation: How Commercial Content Moderation APIs Over- and Under-Moderate Group-Targeted Hate Speech and Linguistic Variations
David Hartmann
Amin Oueslati
Dimitri Staufer
Lena Pohlmann
Simon Munzert
Hendrik Heuer
50
0
0
03 Mar 2025
Safe Spaces or Toxic Places? Content Moderation and Social Dynamics of Online Eating Disorder Communities
Kristina Lerman
Minh Duc Hoang Chu
Charles Bickham
Luca Luceri
Emilio Ferrara
AI4MH
85
0
0
20 Dec 2024
ToxiLab: How Well Do Open-Source LLMs Generate Synthetic Toxicity Data?
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Lin Ai
Yinheng Li
Julia Hirschberg
Congrui Huang
85
1
0
18 Nov 2024
Mitigating Biases to Embrace Diversity: A Comprehensive Annotation Benchmark for Toxic Language
Xinmeng Hou
24
1
0
17 Oct 2024
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets
Tommaso Giorgi
Lorenzo Cima
T. Fagni
M. Avvenuti
S. Cresci
42
9
0
10 Oct 2024
Hate Personified: Investigating the role of LLMs in content moderation
Sarah Masud
Sahajpreet Singh
Viktor Hangya
Alexander Fraser
Tanmoy Chakraborty
30
7
0
03 Oct 2024
Exploring Human-LLM Conversations: Mental Models and the Originator of Toxicity
Johannes Schneider
Arianna Casanova Flores
Anne-Catherine Kranz
50
2
0
08 Jul 2024
Watching the Watchers: A Comparative Fairness Audit of Cloud-based Content Moderation Services
David Hartmann
Amin Oueslati
Dimitri Staufer
MLAU
35
1
0
20 Jun 2024
Let Guidelines Guide You: A Prescriptive Guideline-Centered Data Annotation Methodology
Federico Ruggeri
Eleonora Misino
Arianna Muti
Katerina Korre
Paolo Torroni
Alberto Barrón-Cedeño
39
0
0
20 Jun 2024
Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities
Delfina Sol Martinez Pandiani
Erik Tjong Kim Sang
Davide Ceolin
29
2
0
11 Jun 2024
Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
Neemesh Yadav
Sarah Masud
Vikram Goyal
Md. Shad Akhtar
Tanmoy Chakraborty
28
3
0
06 Jun 2024
Hate Speech Detection with Generalizable Target-aware Fairness
Tong Chen
Danny Wang
Xurong Liang
Marten Risius
Gianluca Demartini
Hongzhi Yin
35
3
0
28 May 2024
FUGNN: Harmonizing Fairness and Utility in Graph Neural Networks
Renqiang Luo
Huafei Huang
Shuo Yu
Zhuoyang Han
Estrid He
Xiuzhen Zhang
Feng Xia
34
3
0
27 May 2024
Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models
Paula Akemi Aoyagui
Sharon Ferguson
Anastasia Kuzminykh
50
0
0
17 May 2024
Algorithmic Fairness: A Tolerance Perspective
Renqiang Luo
Tao Tang
Feng Xia
Jiaying Liu
Chengpei Xu
Leo Yu Zhang
Wei Xiang
Chengqi Zhang
FaML
74
0
0
26 Apr 2024
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps
Kristina Gligorić
Myra Cheng
Lucia Zheng
Esin Durmus
Dan Jurafsky
45
9
0
02 Apr 2024
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models
Hyunbyung Park
Sukyung Lee
Gyoungjin Gim
Yungi Kim
Dahyun Kim
Chanjun Park
VLM
36
0
0
28 Mar 2024
Legally Binding but Unfair? Towards Assessing Fairness of Privacy Policies
Vincent Freiberger
Erik Buchmann
AILaw
32
5
0
12 Mar 2024
Don't Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations
Abhishek Anand
Negar Mokhberian
Prathyusha Naresh Kumar
Anweasha Saha
Zihao He
Ashwin Rao
Fred Morstatter
Kristina Lerman
36
6
0
06 Mar 2024
Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges
Aiqi Jiang
A. Zubiaga
AAML
31
3
0
17 Jan 2024
Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates
Aida Mostafazadeh Davani
Mark Díaz
Dylan K. Baker
Vinodkumar Prabhakaran
AAML
25
14
0
11 Dec 2023
LifeTox: Unveiling Implicit Toxicity in Life Advice
Minbeom Kim
Jahyun Koo
Hwanhee Lee
Joonsuk Park
Hwaran Lee
Kyomin Jung
13
6
0
16 Nov 2023
A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity
Wenbo Zhang
Hangzhi Guo
Ian D Kivlichan
Vinodkumar Prabhakaran
Davis Yadav
Amulya Yadav
23
2
0
07 Nov 2023
On the definition of toxicity in NLP
Sergey Berezin
R. Farahbakhsh
Noel Crespi
21
0
0
03 Oct 2023
Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Speech Detection
Sarah Masud
Ashutosh Bajpai
Tanmoy Chakraborty
13
0
0
21 Sep 2023
BAN-PL: a Novel Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service
Anna Kołos
Inez Okulska
Kinga Głąbińska
Agnieszka Karlinska
Emilia Wisnios
Paweł Ellerik
Andrzej Prałat
11
1
0
21 Aug 2023
Causality Guided Disentanglement for Cross-Platform Hate Speech Detection
Paras Sheth
Tharindu Kumarage
Raha Moraffah
Amanat Chadha
Huan Liu
29
7
0
03 Aug 2023
Mitigating Bias in Conversations: A Hate Speech Classifier and Debiaser with Prompts
Shaina Raza
Chen Ding
D. Pandya
FaML
16
2
0
14 Jul 2023
Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships
David Jurgens
Agrima Seth
Jack E. Sargent
Athena Aghighi
Michael Geraci
22
7
0
06 Jul 2023
DICES Dataset: Diversity in Conversational AI Evaluation for Safety
Lora Aroyo
Alex S. Taylor
Mark Díaz
Christopher Homan
Alicia Parrish
Greg Serapio-García
Vinodkumar Prabhakaran
Ding Wang
29
33
0
20 Jun 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM
LRM
92
1,148
0
17 May 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices
Ahlam Husni Abu Nada
S. Latif
Junaid Qadir
20
4
0
22 Apr 2023
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network
Sreyan Ghosh
Manan Suri
Purva Chiniya
Utkarsh Tyagi
Sonal Kumar
Dinesh Manocha
27
12
0
02 Mar 2023
BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models
Rafal Kocielnik
Shrimai Prabhumoye
Vivian Zhang
Roy Jiang
R. Alvarez
Anima Anandkumar
41
6
0
14 Feb 2023
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
62
2,989
0
20 Oct 2022
A Review of Challenges in Machine Learning based Automated Hate Speech Detection
Abhishek Velankar
H. Patil
Raviraj Joshi
32
8
0
12 Sep 2022
Representation Bias in Data: A Survey on Identification and Resolution Techniques
N. Shahbazi
Yin Lin
Abolfazl Asudeh
H. V. Jagadish
40
67
0
22 Mar 2022
DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances
Sreyan Ghosh
Samden Lepcha
S. Sakshi
R. Shah
S. Umesh
16
14
0
14 Oct 2021
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
250
193
0
15 Sep 2021
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Mai Elsherief
Caleb Ziems
D. Muchlinski
Vaishnavi Anupindi
Jordyn Seybolt
M. D. Choudhury
Diyi Yang
103
236
0
11 Sep 2021
Towards generalisable hate speech detection: a review on obstacles and solutions
Wenjie Yin
A. Zubiaga
117
164
0
17 Feb 2021
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments
Alexandra Chouldechova
FaML
207
2,084
0
24 Oct 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
245
31,257
0
16 Jan 2013