Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification

11 March 2019

Daniel Borkan

Lucas Dixon

Jeffrey Scott Sorensen

Nithum Thain

Lucy Vasserman

Papers citing "Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification"

50 / 125 papers shown

Title
Enforcing Fairness Where It Matters: An Approach Based on Difference-of-Convex Constraints Yutian He Yankun Huang Yao Yao Qihang Lin FaML 27 0 0 18 May 2025
Fine-Grained Bias Exploration and Mitigation for Group-Robust Classification Miaoyun Zhao Qiang Zhang C. Li 36 0 0 11 May 2025
Teaching Models to Understand (but not Generate) High-risk Data Ryan Yixiang Wang Matthew Finlayson Luca Soldaini Swabha Swayamdipta Robin Jia 183 0 0 05 May 2025
Validating LLM-as-a-Judge Systems in the Absence of Gold Labels Luke M. Guerdan Solon Barocas Kenneth Holstein Hanna M. Wallach Zhiwei Steven Wu Alexandra Chouldechova ALM ELM 302 0 0 13 Mar 2025
Out-of-Distribution Detection using Synthetic Data Generation Momin Abbas Muneeza Azmat R. Horesh Mikhail Yurochkin 49 1 0 05 Feb 2025
Focus On This, Not That! Steering LLMs With Adaptive Feature Specification Tom A. Lamb Adam Davies Alasdair Paren Philip Torr Francesco Pinto 54 0 0 30 Oct 2024
Compositional Risk Minimization Divyat Mahajan Mohammad Pezeshki Ioannis Mitliagkas Kartik Ahuja Pascal Vincent 31 3 0 08 Oct 2024
Identity-related Speech Suppression in Generative AI Content Moderation Oghenefejiro Isaacs Anigboro Charlie M. Crawford Danaë Metaxa Sorelle A. Friedler 26 0 0 09 Sep 2024
Towards Generalized Offensive Language Identification A. Dmonte Tejas Arya Tharindu Ranasinghe Marcos Zampieri 52 3 0 26 Jul 2024
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs S. Kadhe Farhan Ahmed Dennis Wei Nathalie Baracaldo Inkit Padhi MoMe MU 33 7 0 17 Jun 2024
Automated Program Repair: Emerging trends pose and expose problems for benchmarks J. Renzullo Pemma Reiter Westley Weimer Stephanie Forrest 47 1 0 08 May 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models Luiza Amador Pozzobon Patrick Lewis Sara Hooker Beyza Ermis 50 7 0 06 Mar 2024
Implicit Bias and Fast Convergence Rates for Self-attention Bhavya Vasudeva Puneesh Deora Christos Thrampoulidis 42 15 0 08 Feb 2024
Understanding Domain Generalization: A Noise Robustness Perspective Rui Qiao K. H. Low OOD 39 6 0 26 Jan 2024
Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts Xiruo Ding Zhecheng Sheng Brian Hur Feng Chen Serguei V. S. Pakhomov Trevor Cohen OOD 23 0 0 09 Dec 2023
Using Early Readouts to Mediate Featural Bias in Distillation Rishabh Tiwari D. Sivasubramanian Anmol Reddy Mekala Ganesh Ramakrishnan Pradeep Shenoy 26 5 0 28 Oct 2023
Model Merging by Uncertainty-Based Gradient Matching Nico Daheim Thomas Möllenhoff Edoardo Ponti Iryna Gurevych Mohammad Emtiyaz Khan MoMe FedML 37 45 0 19 Oct 2023
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models Luiza Amador Pozzobon Beyza Ermis Patrick Lewis Sara Hooker 36 20 0 11 Oct 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI Mahyar Abbasian Elahe Khatibi Iman Azimi David Oniani Zahra Shakeri Hossein Abad ... Bryant Lin Olivier Gevaert Li-Jia Li Ramesh C. Jain Amir M. Rahmani LM&MA ELM AI4MH 45 66 0 21 Sep 2023
Bias Amplification Enhances Minority Group Performance Gaotang Li Jiarui Liu Wei Hu 30 5 0 13 Sep 2023
Zero-Shot Robustification of Zero-Shot Models Dyah Adila Changho Shin Lin Cai Frederic Sala 48 19 0 08 Sep 2023
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection Fatma Elsafoury 29 3 0 31 Aug 2023
Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions Reem I. Masoud Ziquan Liu Martin Ferianc Philip C. Treleaven Miguel R. D. Rodrigues 27 50 0 25 Aug 2023
Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation Xinshuo Hu Dongfang Li Baotian Hu Zihao Zheng Zhenyu Liu Hao Fei KELM MU 40 26 0 16 Aug 2023
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification K. Chernyshev E. Garanina Duygu Bayram Qiankun Zheng Lukas Edman 13 0 0 08 Jun 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations Lifan Yuan Yangyi Chen Yuchen Zhang Hongcheng Gao Fangyuan Zou Xingyi Cheng Heng Ji Zhiyuan Liu Maosong Sun 52 75 0 07 Jun 2023
An Invariant Learning Characterization of Controlled Text Generation Carolina Zheng Claudia Shi Keyon Vafa Amir Feder David M. Blei OOD 38 8 0 31 May 2023
Analyzing Text Representations by Measuring Task Alignment César González-Gutiérrez Audi Primadhanty Francesco Cazzaro A. Quattoni 23 1 0 31 May 2023
Rectifying Group Irregularities in Explanations for Distribution Shift Adam Stein Yinjun Wu Eric Wong Mayur Naik 42 1 0 25 May 2023
Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis Oscar Chew Hsuan-Tien Lin Kai-Wei Chang Kuan-Hao Huang 40 5 0 23 May 2023
Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization Ting Wu Rui Zheng Tao Gui Qi Zhang Xuanjing Huang 51 2 0 20 May 2023
PaLM 2 Technical Report Rohan Anil Andrew M. Dai Orhan Firat Melvin Johnson Dmitry Lepikhin ... Ce Zheng Wei Zhou Denny Zhou Slav Petrov Yonghui Wu ReLM LRM 128 1,152 0 17 May 2023
Addressing Biases in the Texts using an End-to-End Pipeline Approach Shaina Raza Syed Raza Bashir Sneha Urooj Qamar 38 0 0 13 Mar 2023
Distributionally Robust Optimization with Probabilistic Group Soumya Suvra Ghosal Yixuan Li OOD 16 7 0 10 Mar 2023
Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness Zahra Ashktorab Benjamin Hoover Mayank Agarwal Casey Dugan Werner Geyer Han Yang Mikhail Yurochkin FaML 43 17 0 01 Mar 2023
Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets Irina Bejan Artem Sokolov Katja Filippova TDI 32 9 0 27 Feb 2023
Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection Soumyajit Gupta Sooyong Lee Maria De-Arteaga Matthew Lease 27 13 0 14 Feb 2023
Towards Agile Text Classifiers for Everyone Maximilian Mozes Jessica Hoffmann Katrin Tomanek Muhamed Kouate Nithum Thain Ann Yuan Tolga Bolukbasi Lucas Dixon 52 13 0 13 Feb 2023
A benchmark for toxic comment classification on Civil Comments dataset Corentin Duchene Henri Jamet Pierre Guillaume Reda Dehak 41 8 0 26 Jan 2023
ViHOS: Hate Speech Spans Detection for Vietnamese Phu Gia Hoang Canh Duc Luu K. Tran Kiet Van Nguyen Ngan Luu-Thuy Nguyen 31 20 0 24 Jan 2023
Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting P. Sattigeri S. Ghosh Inkit Padhi Pierre Dognin Kush R. Varshney FaML 27 28 0 13 Dec 2022
Editing Models with Task Arithmetic Gabriel Ilharco Marco Tulio Ribeiro Mitchell Wortsman Suchin Gururangan Ludwig Schmidt Hannaneh Hajishirzi Ali Farhadi KELM MoMe MU 77 443 0 08 Dec 2022
Addressing Distribution Shift at Test Time in Pre-trained Language Models Ayush Singh J. Ortega VLM 29 4 0 05 Dec 2022
SOLD: Sinhala Offensive Language Dataset Tharindu Ranasinghe Isuri Anuradha Damith Premasiri Kanishka Silva Hansi Hettiarachchi Lasitha Uyangodage Marcos Zampieri 41 8 0 01 Dec 2022
A Fair Loss Function for Network Pruning Robbie Meyer Alexander Wong CVBM 27 3 0 18 Nov 2022
Striving for data-model efficiency: Identifying data externalities on group performance Esther Rolf Ben Packer Alex Beutel Fernando Diaz TDI 30 2 0 11 Nov 2022
Okapi: Generalising Better by Making Statistical Matches Match Myles Bartlett Sara Romiti V. Sharmanska Novi Quadrianto 45 3 0 07 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection Jiyun Kim Byounghan Lee Kyung-ah Sohn 34 13 0 01 Nov 2022
Nearest Neighbor Language Models for Stylistic Controllable Generation Severino Trotta Lucie Flek Charles F Welch 31 4 0 27 Oct 2022
Sufficient Invariant Learning for Distribution Shift Taero Kim Sungjun Lim Kyungwoo Song OOD 38 2 0 24 Oct 2022