1903.04561
Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification
11 March 2019
Daniel Borkan
Lucas Dixon
Jeffrey Scott Sorensen
Nithum Thain
Lucy Vasserman
Papers citing
"Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification"
50 / 125 papers shown
Title
Enforcing Fairness Where It Matters: An Approach Based on Difference-of-Convex Constraints
Yutian He
Yankun Huang
Yao Yao
Qihang Lin
FaML
27
0
0
18 May 2025
Fine-Grained Bias Exploration and Mitigation for Group-Robust Classification
Miaoyun Zhao
Qiang Zhang
C. Li
36
0
0
11 May 2025
Teaching Models to Understand (but not Generate) High-risk Data
Ryan Yixiang Wang
Matthew Finlayson
Luca Soldaini
Swabha Swayamdipta
Robin Jia
183
0
0
05 May 2025
Validating LLM-as-a-Judge Systems in the Absence of Gold Labels
Luke M. Guerdan
Solon Barocas
Kenneth Holstein
Hanna M. Wallach
Zhiwei Steven Wu
Alexandra Chouldechova
ALM
ELM
302
0
0
13 Mar 2025
Out-of-Distribution Detection using Synthetic Data Generation
Momin Abbas
Muneeza Azmat
R. Horesh
Mikhail Yurochkin
49
1
0
05 Feb 2025
Focus On This, Not That! Steering LLMs With Adaptive Feature Specification
Tom A. Lamb
Adam Davies
Alasdair Paren
Philip Torr
Francesco Pinto
54
0
0
30 Oct 2024
Compositional Risk Minimization
Divyat Mahajan
Mohammad Pezeshki
Ioannis Mitliagkas
Kartik Ahuja
Pascal Vincent
31
3
0
08 Oct 2024
Identity-related Speech Suppression in Generative AI Content Moderation
Oghenefejiro Isaacs Anigboro
Charlie M. Crawford
Danaë Metaxa
Sorelle A. Friedler
26
0
0
09 Sep 2024
Towards Generalized Offensive Language Identification
A. Dmonte
Tejas Arya
Tharindu Ranasinghe
Marcos Zampieri
52
3
0
26 Jul 2024
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs
S. Kadhe
Farhan Ahmed
Dennis Wei
Nathalie Baracaldo
Inkit Padhi
MoMe
MU
33
7
0
17 Jun 2024
Automated Program Repair: Emerging trends pose and expose problems for benchmarks
J. Renzullo
Pemma Reiter
Westley Weimer
Stephanie Forrest
47
1
0
08 May 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models
Luiza Amador Pozzobon
Patrick Lewis
Sara Hooker
Beyza Ermis
50
7
0
06 Mar 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
42
15
0
08 Feb 2024
Understanding Domain Generalization: A Noise Robustness Perspective
Rui Qiao
K. H. Low
OOD
39
6
0
26 Jan 2024
Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts
Xiruo Ding
Zhecheng Sheng
Brian Hur
Feng Chen
Serguei V. S. Pakhomov
Trevor Cohen
OOD
23
0
0
09 Dec 2023
Using Early Readouts to Mediate Featural Bias in Distillation
Rishabh Tiwari
D. Sivasubramanian
Anmol Reddy Mekala
Ganesh Ramakrishnan
Pradeep Shenoy
26
5
0
28 Oct 2023
Model Merging by Uncertainty-Based Gradient Matching
Nico Daheim
Thomas Möllenhoff
Edoardo Ponti
Iryna Gurevych
Mohammad Emtiyaz Khan
MoMe
FedML
37
45
0
19 Oct 2023
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models
Luiza Amador Pozzobon
Beyza Ermis
Patrick Lewis
Sara Hooker
36
20
0
11 Oct 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian
Elahe Khatibi
Iman Azimi
David Oniani
Zahra Shakeri Hossein Abad
...
Bryant Lin
Olivier Gevaert
Li-Jia Li
Ramesh C. Jain
Amir M. Rahmani
LM&MA
ELM
AI4MH
45
66
0
21 Sep 2023
Bias Amplification Enhances Minority Group Performance
Gaotang Li
Jiarui Liu
Wei Hu
30
5
0
13 Sep 2023
Zero-Shot Robustification of Zero-Shot Models
Dyah Adila
Changho Shin
Lin Cai
Frederic Sala
48
19
0
08 Sep 2023
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection
Fatma Elsafoury
29
3
0
31 Aug 2023
Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions
Reem I. Masoud
Ziquan Liu
Martin Ferianc
Philip C. Treleaven
Miguel R. D. Rodrigues
27
50
0
25 Aug 2023
Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation
Xinshuo Hu
Dongfang Li
Baotian Hu
Zihao Zheng
Zhenyu Liu
Hao Fei
KELM
MU
40
26
0
16 Aug 2023
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification
K. Chernyshev
E. Garanina
Duygu Bayram
Qiankun Zheng
Lukas Edman
13
0
0
08 Jun 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan
Yangyi Chen
Yuchen Zhang
Hongcheng Gao
Fangyuan Zou
Xingyi Cheng
Heng Ji
Zhiyuan Liu
Maosong Sun
52
75
0
07 Jun 2023
An Invariant Learning Characterization of Controlled Text Generation
Carolina Zheng
Claudia Shi
Keyon Vafa
Amir Feder
David M. Blei
OOD
38
8
0
31 May 2023
Analyzing Text Representations by Measuring Task Alignment
César González-Gutiérrez
Audi Primadhanty
Francesco Cazzaro
A. Quattoni
23
1
0
31 May 2023
Rectifying Group Irregularities in Explanations for Distribution Shift
Adam Stein
Yinjun Wu
Eric Wong
Mayur Naik
42
1
0
25 May 2023
Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis
Oscar Chew
Hsuan-Tien Lin
Kai-Wei Chang
Kuan-Hao Huang
40
5
0
23 May 2023
Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization
Ting Wu
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
51
2
0
20 May 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM
LRM
128
1,152
0
17 May 2023
Addressing Biases in the Texts using an End-to-End Pipeline Approach
Shaina Raza
Syed Raza Bashir
Sneha
Urooj Qamar
38
0
0
13 Mar 2023
Distributionally Robust Optimization with Probabilistic Group
Soumya Suvra Ghosal
Yixuan Li
OOD
16
7
0
10 Mar 2023
Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness
Zahra Ashktorab
Benjamin Hoover
Mayank Agarwal
Casey Dugan
Werner Geyer
Han Yang
Mikhail Yurochkin
FaML
43
17
0
01 Mar 2023
Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets
Irina Bejan
Artem Sokolov
Katja Filippova
TDI
32
9
0
27 Feb 2023
Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection
Soumyajit Gupta
Sooyong Lee
Maria De-Arteaga
Matthew Lease
27
13
0
14 Feb 2023
Towards Agile Text Classifiers for Everyone
Maximilian Mozes
Jessica Hoffmann
Katrin Tomanek
Muhamed Kouate
Nithum Thain
Ann Yuan
Tolga Bolukbasi
Lucas Dixon
52
13
0
13 Feb 2023
A benchmark for toxic comment classification on Civil Comments dataset
Corentin Duchene
Henri Jamet
Pierre Guillaume
Reda Dehak
41
8
0
26 Jan 2023
ViHOS: Hate Speech Spans Detection for Vietnamese
Phu Gia Hoang
Canh Duc Luu
K. Tran
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
31
20
0
24 Jan 2023
Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting
P. Sattigeri
S. Ghosh
Inkit Padhi
Pierre Dognin
Kush R. Varshney
FaML
27
28
0
13 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
77
443
0
08 Dec 2022
Addressing Distribution Shift at Test Time in Pre-trained Language Models
Ayush Singh
J. Ortega
VLM
29
4
0
05 Dec 2022
SOLD: Sinhala Offensive Language Dataset
Tharindu Ranasinghe
Isuri Anuradha
Damith Premasiri
Kanishka Silva
Hansi Hettiarachchi
Lasitha Uyangodage
Marcos Zampieri
41
8
0
01 Dec 2022
A Fair Loss Function for Network Pruning
Robbie Meyer
Alexander Wong
CVBM
27
3
0
18 Nov 2022
Striving for data-model efficiency: Identifying data externalities on group performance
Esther Rolf
Ben Packer
Alex Beutel
Fernando Diaz
TDI
30
2
0
11 Nov 2022
Okapi: Generalising Better by Making Statistical Matches Match
Myles Bartlett
Sara Romiti
V. Sharmanska
Novi Quadrianto
45
3
0
07 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim
Byounghan Lee
Kyung-ah Sohn
34
13
0
01 Nov 2022
Nearest Neighbor Language Models for Stylistic Controllable Generation
Severino Trotta
Lucie Flek
Charles F Welch
31
4
0
27 Oct 2022
Sufficient Invariant Learning for Distribution Shift
Taero Kim
Sungjun Lim
Kyungwoo Song
OOD
38
2
0
24 Oct 2022