Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes
arXiv:2402.01981, 3 February 2024
Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Tong Yu, Hanieh Deilamsalehy, Ruiyi Zhang, Sungchul Kim, Franck Dernoncourt
Papers citing "Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes" (20 of 20 papers shown):
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models. Zhiting Fan, Ruizhe Chen, Zuozhu Liu. 30 Apr 2025.
A Survey of Foundation Model-Powered Recommender Systems: From Feature-Based, Generative to Agentic Paradigms. Chengkai Huang, Hongtao Huang, Tong Yu, Kaige Xie, Junda Wu, Shuai Zhang, Julian McAuley, Dietmar Jannach, Lina Yao. 23 Apr 2025. [LRM, AI4CE]
FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering. Y. Li, Zhiting Fan, Ruizhe Chen, Xiaotang Gai, Luqi Gong, Yan Zhang, Zuozhu Liu. 20 Apr 2025. [LLMSV]
DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models. Suyoung Bae, YunSeok Choi, Jee-Hyong Lee. 25 Mar 2025.
Rethinking Prompt-based Debiasing in Large Language Models. Xinyi Yang, Runzhe Zhan, Derek F. Wong, Shu Yang, Junchao Wu, Lidia S. Chao. 12 Mar 2025. [ALM]
More of the Same: Persistent Representational Harms Under Increased Representation. Jennifer Mickel, Maria De-Arteaga, Leqi Liu, Kevin Tian. 01 Mar 2025.
Beneath the Surface: How Large Language Models Reflect Hidden Bias. Jinhao Pan, Chahat Raj, Ziyu Yao, Ziwei Zhu. 27 Feb 2025.
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs. Angelina Wang, Michelle Phan, Daniel E. Ho, Sanmi Koyejo. 04 Feb 2025.
Different Bias Under Different Criteria: Assessing Bias in LLMs with a Fact-Based Approach. Changgeon Ko, Jisu Shin, Hoyun Song, Jeongyeon Seo, Jong C. Park. 26 Nov 2024.
Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data. Atmika Gorti, Manas Gaur, Aman Chadha. 20 Aug 2024.
REFINE-LM: Mitigating Language Model Stereotypes via Reinforcement Learning. Rameez Qureshi, Naim Es-Sebbani, Luis Galárraga, Yvette Graham, Miguel Couceiro, Zied Bouraoui. 18 Aug 2024.
BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization. Ahmed Allam. 18 Jul 2024.
An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models. Jayanta Sadhu, Maneesha Rani Saha, Rifat Shahriyar. 08 Jul 2024.
Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias. Jayanta Sadhu, Maneesha Rani Saha, Rifat Shahriyar. 03 Jul 2024.
MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs. Vera Neplenbroek, Arianna Bisazza, Raquel Fernández. 11 Jun 2024.
Large Language Models are Zero-Shot Reasoners. Takeshi Kojima, S. Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa. 24 May 2022. [ReLM, LRM]
BBQ: A Hand-Built Bias Benchmark for Question Answering. Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, Sam Bowman. 15 Oct 2021.
Generating Gender Augmented Data for NLP. N. Jain, Maja Popovic, Declan Groves, Eva Vanmassenhove. 13 Jul 2021.
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP. Timo Schick, Sahana Udupa, Hinrich Schütze. 28 Feb 2021.
Debiasing Pre-trained Contextualised Embeddings. Masahiro Kaneko, Danushka Bollegala. 23 Jan 2021.