arXiv: 2210.02938
Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks
6 October 2022
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
Papers citing "Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks" (39 / 39 papers shown)
Evaluating the Effect of Retrieval Augmentation on Social Biases
Tianhui Zhang
Yi Zhou
Danushka Bollegala
38
0
0
24 Feb 2025
Smaller Large Language Models Can Do Moral Self-Correction
Guangliang Liu
Zhiyu Xue
Rongrong Wang
Kristen Marie Johnson
LRM
32
0
0
30 Oct 2024
BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs
Zhiting Fan
Ruizhe Chen
Ruiling Xu
Zuozhu Liu
KELM
21
16
0
14 Jul 2024
Social Bias Evaluation for Large Language Models Requires Prompt Variations
Rem Hida
Masahiro Kaneko
Naoaki Okazaki
38
14
0
03 Jul 2024
A Study of Nationality Bias in Names and Perplexity using Off-the-Shelf Affect-related Tweet Classifiers
Valentin Barriere
Sebastian Cifuentes
28
0
0
01 Jul 2024
Why Don't Prompt-Based Fairness Metrics Correlate?
A. Zayed
Gonçalo Mordido
Ioana Baldini
Sarath Chandar
ALM
47
4
0
09 Jun 2024
Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
Guangliang Liu
Milad Afshari
Xitong Zhang
Zhiyu Xue
Avrajit Ghosh
Bidhan Bashyal
Rongrong Wang
K. Johnson
27
0
0
06 Jun 2024
Anna Karenina Strikes Again: Pre-Trained LLM Embeddings May Favor High-Performing Learners
Abigail Gurin Schleifer
Beata Beigman Klebanov
Moriah Ariely
Giora Alexandron
30
2
0
06 Jun 2024
On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept
Guangliang Liu
Haitao Mao
Bochuan Cao
Zhiyu Xue
K. Johnson
Jiliang Tang
Rongrong Wang
LRM
34
9
0
04 Jun 2024
Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models
Paula Akemi Aoyagui
Sharon Ferguson
Anastasia Kuzminykh
50
0
0
17 May 2024
Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps
Giuseppe Attanasio
Beatrice Savoldi
Dennis Fucci
Dirk Hovy
33
4
0
28 Feb 2024
Eagle: Ethical Dataset Given from Real Interactions
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
42
3
0
22 Feb 2024
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation
Kristian Lum
Jacy Reese Anthis
Chirag Nagpal
Alexander D'Amour
31
13
0
20 Feb 2024
A Note on Bias to Complete
Jia Xu
Mona Diab
47
2
0
18 Feb 2024
Semantic Properties of cosine based bias scores for word embeddings
Sarah Schröder
Alexander Schulz
Fabian Hinder
Barbara Hammer
29
1
0
27 Jan 2024
The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
30
3
0
16 Jan 2024
Understanding the Effect of Model Compression on Social Bias in Large Language Models
Gustavo Gonçalves
Emma Strubell
18
9
0
09 Dec 2023
General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level
Bingkang Shi
Xiaodan Zhang
Dehan Kong
Yulei Wu
Zongzhen Liu
Honglei Lyu
Longtao Huang
AI4CE
25
2
0
23 Nov 2023
Fair Text Classification with Wasserstein Independence
Thibaud Leteno
Antoine Gourru
Charlotte Laclau
Rémi Emonet
Christophe Gravier
FaML
26
2
0
21 Nov 2023
Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?
Yusuke Sakai
Hidetaka Kamigaito
Katsuhiko Hayashi
Taro Watanabe
26
1
0
15 Nov 2023
Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models
Carlos Alejandro Aguirre
Kuleen Sasse
Isabel Cachola
Mark Dredze
30
1
0
14 Nov 2023
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello
Emanuele Bugliarello
Stephanie Brandl
Desmond Elliott
23
7
0
26 Oct 2023
A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models
Yi Zhou
Jose Camacho-Collados
Danushka Bollegala
81
6
0
19 Oct 2023
Co²PT: Mitigating Bias in Pre-trained Language Models through Counterfactual Contrastive Prompt Tuning
Xiangjue Dong
Ziwei Zhu
Zhuoer Wang
Maria Teleki
James Caverlee
39
11
0
19 Oct 2023
Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels
Panatchakorn Anantaprayoon
Masahiro Kaneko
Naoaki Okazaki
64
16
0
18 Sep 2023
In-Contextual Gender Bias Suppression for Large Language Models
Daisuke Oba
Masahiro Kaneko
Danushka Bollegala
31
8
0
13 Sep 2023
Bias and Fairness in Large Language Models: A Survey
Isabel O. Gallegos
Ryan A. Rossi
Joe Barrow
Md Mehrab Tanjim
Sungchul Kim
Franck Dernoncourt
Tong Yu
Ruiyi Zhang
Nesreen Ahmed
AILaw
21
490
0
02 Sep 2023
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection
Fatma Elsafoury
27
3
0
31 Aug 2023
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection
Fatma Elsafoury
Stamos Katsigiannis
30
1
0
22 May 2023
Word Embeddings Are Steers for Language Models
Chi Han
Jialiang Xu
Manling Li
Yi Ren Fung
Chenkai Sun
Nan Jiang
Tarek F. Abdelzaher
Heng Ji
LLMSV
26
27
0
22 May 2023
Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach
Masahiro Kaneko
Graham Neubig
Naoaki Okazaki
36
6
0
19 May 2023
On the Origins of Bias in NLP through the Lens of the Jim Code
Fatma Elsafoury
Gavin Abercrombie
41
4
0
16 May 2023
On the Independence of Association Bias and Empirical Fairness in Language Models
Laura Cabello
Anna Katrine van Zee
Anders Søgaard
26
25
0
20 Apr 2023
Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
24
9
0
28 Jan 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
29
209
0
16 Jan 2023
Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang
Hanchun Jiang
AI4CE
18
1
0
26 Nov 2022
MABEL: Attenuating Gender Bias using Textual Entailment Data
Jacqueline He
Mengzhou Xia
C. Fellbaum
Danqi Chen
21
32
0
26 Oct 2022
Gender Bias in Meta-Embeddings
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
36
6
0
19 May 2022
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
218
138
0
23 Jan 2021