Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting
29 April 2020
Guanhua Zhang, Bing Bai, Junqi Zhang, Kun Bai, Conghui Zhu, Tiejun Zhao
arXiv:2004.14088 (v3)
Papers citing "Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting"
40 / 40 papers shown

Don't Erase, Inform! Detecting and Contextualizing Harmful Language in Cultural Heritage Collections
Orfeas Menis Mastromichalakis, Jason Liartis, Kristina Rose, Antoine Isaac, Giorgos Stamou · KELM · 15 / 0 / 0 · 30 May 2025

Wisdom from Diversity: Bias Mitigation Through Hybrid Human-LLM Crowds
Axel Abels, Tom Lenaerts · 56 / 0 / 0 · 18 May 2025

Detecting Linguistic Bias in Government Documents Using Large Language Models
Milena de Swart, Floris den Hengst, Jieying Chen · 149 / 0 / 0 · 20 Feb 2025

On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments
Jingchao Fang, Nikos Aréchiga, Keiichi Namaoshi, N. Bravo, Candice L Hogan, David A. Shamma · 68 / 5 / 0 · 10 Jul 2024

NLPGuard: A Framework for Mitigating the Use of Protected Attributes by NLP Classifiers
Salvatore Greco, Ke Zhou, L. Capra, Tania Cerquitelli, Daniele Quercia · 53 / 3 / 0 · 01 Jul 2024

Hate Speech Detection with Generalizable Target-aware Fairness
Tong Chen, Danny Wang, Xurong Liang, Marten Risius, Gianluca Demartini, Hongzhi Yin · 90 / 4 / 0 · 28 May 2024

Target Span Detection for Implicit Harmful Content
Nazanin Jafari, James Allan, Sheikh Muhammad Sarwar · 106 / 4 / 0 · 28 Mar 2024

PEFTDebias: Capturing debiasing information using PEFTs
Sumit Agarwal, Aditya Srikanth Veerubhotla, Srijan Bansal · 65 / 3 / 0 · 01 Dec 2023

Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models
Yueqing Liang, Lu Cheng, Ali Payani, Kai Shu · 65 / 3 / 0 · 15 Nov 2023

A Survey on Fairness in Large Language Models
Yingji Li, Mengnan Du, Rui Song, Xin Wang, Ying Wang · ALM · 126 / 70 / 0 · 20 Aug 2023

Sociodemographic Bias in Language Models: A Survey and Forward Path
Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, R. Passonneau · 86 / 23 / 0 · 13 Jun 2023

An Invariant Learning Characterization of Controlled Text Generation
Carolina Zheng, Claudia Shi, Keyon Vafa, Amir Feder, David M. Blei · OOD · 93 / 8 / 0 · 31 May 2023

A Causal View of Entity Bias in (Large) Language Models
Fei Wang, Wen-An Mo, Yiwei Wang, Wenxuan Zhou, Muhao Chen · 84 / 15 / 0 · 24 May 2023

Should We Attend More or Less? Modulating Attention for Fairness
A. Zayed, Gonçalo Mordido, Samira Shabanian, Sarath Chandar · 78 / 10 / 0 · 22 May 2023

Deep Learning on a Healthy Data Diet: Finding Important Examples for Fairness
A. Zayed, Prasanna Parthasarathi, Gonçalo Mordido, Hamid Palangi, Samira Shabanian, Sarath Chandar · 54 / 22 / 0 · 20 Nov 2022

TCAB: A Large-Scale Text Classification Attack Benchmark
Kalyani Asthana, Zhouhang Xie, Wencong You, Adam Noack, Jonathan Brophy, Sameer Singh, Daniel Lowd · 119 / 3 / 0 · 21 Oct 2022

Explainable Abuse Detection as Intent Classification and Slot Filling
Agostina Calabrese, Bjorn Ross, Mirella Lapata · 85 / 11 / 0 · 06 Oct 2022

Fairness Reprogramming
Guanhua Zhang, Yihua Zhang, Yang Zhang, Wenqi Fan, Qing Li, Sijia Liu, Shiyu Chang · AAML · 213 / 40 / 0 · 21 Sep 2022

Power of Explanations: Towards automatic debiasing in hate speech detection
Yitao Cai, Arthur Zimek, Gerhard Wunder, Eirini Ntoutsi · 56 / 6 / 0 · 07 Sep 2022

Enriching Abusive Language Detection with Community Context
Jana Kurrek, Haji Mohammad Saleem, D. Ruths · 47 / 5 / 0 · 16 Jun 2022

Towards a Deep Multi-layered Dialectal Language Analysis: A Case Study of African-American English
Jamell Dacon · 53 / 6 / 0 · 03 Jun 2022

Toward Understanding Bias Correlations for Mitigation in NLP
Lu Cheng, Suyu Ge, Huan Liu · 62 / 9 / 0 · 24 May 2022

Why only Micro-F1? Class Weighting of Measures for Relation Classification
David Harbecke, Yuxuan Chen, Leonhard Hennig, Christoph Alt · 57 / 20 / 0 · 19 May 2022

Theories of "Gender" in NLP Bias Research
Hannah Devinney, Jenny Björklund, H. Björklund · AI4CE · 106 / 76 / 0 · 05 May 2022

Easy Adaptation to Mitigate Gender Bias in Multilingual Text Classification
Xiaolei Huang · FaML · 41 / 9 / 0 · 12 Apr 2022

On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations
Yang Trista Cao, Yada Pruksachatkun, Kai-Wei Chang, Rahul Gupta, Varun Kumar, Jwala Dhamala, Aram Galstyan · 69 / 99 / 0 · 25 Mar 2022

Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists
Giuseppe Attanasio, Debora Nozza, Dirk Hovy, Elena Baralis · 50 / 56 / 0 · 17 Mar 2022

Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language
N. Babakov, V. Logacheva, Alexander Panchenko · 38 / 3 / 0 · 04 Mar 2022

On Modality Bias Recognition and Reduction
Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan S. Kankanhalli, A. Bimbo · 72 / 28 / 0 · 25 Feb 2022

Context-Aware Discrimination Detection in Job Vacancies using Computational Language Models
S. Vethman, A. Adhikari, M. D. Boer, J. V. Genabeek, C. Veenman · 33 / 2 / 0 · 02 Feb 2022

Handling Bias in Toxic Speech Detection: A Survey
Tanmay Garg, Sarah Masud, Tharun Suresh, Tanmoy Chakraborty · 100 / 98 / 0 · 26 Jan 2022

Enhancing Model Robustness and Fairness with Causality: A Regularization Approach
Zhao Wang, Kai Shu, A. Culotta · OOD · 112 / 14 / 0 · 03 Oct 2021

Trustworthy AI: A Computational Perspective
Haochen Liu, Yiqi Wang, Wenqi Fan, Xiaorui Liu, Yaxin Li, Shaili Jain, Yunhao Liu, Anil K. Jain, Jiliang Tang · FaML · 192 / 212 / 0 · 12 Jul 2021

A Survey of Race, Racism, and Anti-Racism in NLP
Anjalie Field, Su Lin Blodgett, Zeerak Talat, Yulia Tsvetkov · 86 / 124 / 0 · 21 Jun 2021

Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification
Yada Pruksachatkun, Satyapriya Krishna, Jwala Dhamala, Rahul Gupta, Kai-Wei Chang · 73 / 33 / 0 · 21 Jun 2021

The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification
Haochen Liu, Wei Jin, Hamid Karimi, Zitao Liu, Jiliang Tang · 54 / 32 / 0 · 06 May 2021

Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation
N. Babakov, V. Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko · 64 / 10 / 0 · 09 Mar 2021

On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning
Xisen Jin, Francesco Barbieri, Brendan Kennedy, Aida Mostafazadeh Davani, Leonardo Neves, Xiang Ren · 90 / 5 / 0 · 24 Oct 2020

Why Attentions May Not Be Interpretable?
Bing Bai, Jian Liang, Guanhua Zhang, Hao Li, Kun Bai, Fei Wang · FAtt · 86 / 60 / 0 · 10 Jun 2020

Language (Technology) is Power: A Critical Survey of "Bias" in NLP
Su Lin Blodgett, Solon Barocas, Hal Daumé, Hanna M. Wallach · 159 / 1,257 / 0 · 28 May 2020