On Measuring Social Biases in Sentence Encoders

25 March 2019

Papers citing "On Measuring Social Biases in Sentence Encoders"

50 / 367 papers shown

Colombian Waitresses y Jueces canadienses: Gender and Country Biases in Occupation Recommendations from LLMs Elisa Forcada Rodríguez Olatz Perez-de-Viñaspre Jon Ander Campos Dietrich Klakow Vagrant Gautam 27 0 0 05 May 2025
Do Large Language Models know who did what to whom? Joseph M. Denning Xiaohan Bryor Snefjella Idan A. Blank 50 1 0 23 Apr 2025
Masculine Defaults via Gendered Discourse in Podcasts and Large Language Models Maria Teleki Xiangjue Dong Haoran Liu James Caverlee 30 0 0 15 Apr 2025
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge Riccardo Cantini A. Orsino Massimo Ruggiero Domenico Talia AAML ELM 40 0 0 10 Apr 2025
GraphSeg: Segmented 3D Representations via Graph Edge Addition and Contraction Haozhan Tang Tianyi Zhang Oliver Kroemer Matthew Johnson-Roberson Weiming Zhi 3DPC 57 0 0 04 Apr 2025
The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data Massimiliano Luca Ciro Beneduce Bruno Lepri Jacopo Staiano 45 0 0 02 Apr 2025
Model Risk Management for Generative AI In Financial Institutions Anwesha Bhattacharyya Ye Yu Hanyu Yang Rahul Singh Tarun Joshi Jie Chen Kiran Yalavarthy AIFin MedIm 44 0 0 19 Mar 2025
Gender and content bias in Large Language Models: a case study on Google Gemini 2.0 Flash Experimental Roberto Balestri 42 0 0 18 Mar 2025
An Evaluation of LLMs for Detecting Harmful Computing Terms Joshua Jacas Hana Winchester Alicia Boyd Brittany Johnson 56 0 0 12 Mar 2025
On the Mutual Influence of Gender and Occupation in LLM Representations Haozhe An Connor Baumler Abhilasha Sancheti Rachel Rudinger AI4CE 53 0 0 09 Mar 2025
Gender Encoding Patterns in Pretrained Language Model Representations Mahdi Zakizadeh Mohammad Taher Pilehvar 43 0 0 09 Mar 2025
Visual Cues of Gender and Race are Associated with Stereotyping in Vision-Language Models Messi H.J. Lee Soyeon Jeon Jacob M. Montgomery Calvin K Lai VLM CoGe 74 0 0 07 Mar 2025
Implicit Bias in LLMs: A Survey Xinru Lin Luyang Li 57 0 0 04 Mar 2025
Rethinking LLM Bias Probing Using Lessons from the Social Sciences Kirsten N. Morehouse S. Swaroop Weiwei Pan 43 0 0 28 Feb 2025
The Impact of Inference Acceleration on Bias of LLMs Elisabeth Kirsten Ivan Habernal Vedant Nanda Muhammad Bilal Zafar 36 0 0 20 Feb 2025
Identifying Gender Stereotypes and Biases in Automated Translation from English to Italian using Similarity Networks Fatemeh Mohammadi Marta Annamaria Tamborini Paolo Ceravolo Costanza Nardocci S. Maghool 63 0 0 17 Feb 2025
Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models Yue Xu Chengyan Fu Li Xiong Sibei Yang Wenjie Wang 42 0 0 17 Feb 2025
Intrinsic Bias is Predicted by Pretraining Data and Correlates with Downstream Performance in Vision-Language Encoders Kshitish Ghate Isaac Slaughter Kyra Wilson Mona Diab Aylin Caliskan 76 0 0 11 Feb 2025
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs Angelina Wang Michelle Phan Daniel E. Ho Sanmi Koyejo 43 2 0 04 Feb 2025
Bridging the Fairness Gap: Enhancing Pre-trained Models with LLM-Generated Sentences Liu Yu Ludie Guo Ping Kuang Fan Zhou 34 0 0 12 Jan 2025
MAFT: Efficient Model-Agnostic Fairness Testing for Deep Neural Networks via Zero-Order Gradient Search Zhaohui Wang Min Zhang Jingran Yang Bojie Shao Min Zhang 41 4 0 31 Dec 2024
Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach Daiki Shirafuji Makoto Takenaka Shinya Taguchi LLMAG 67 0 0 16 Dec 2024
Improving LLM Group Fairness on Tabular Data via In-Context Learning Valeriia Cherepanova Chia-Jung Lee Nil-Jana Akpinar Riccardo Fogliato Martín Bertrán Michael Kearns James Zou LMTD 63 0 0 05 Dec 2024
Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment Feng He Chao Zhang Zhixue Zhao 71 0 0 04 Dec 2024
How far can bias go? -- Tracing bias from pretraining data to alignment Marion Thaler Abdullatif Köksal Alina Leidinger Anna Korhonen Hinrich Schütze 69 0 0 28 Nov 2024
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings Carolin M. Schuster Maria-Alexandra Dinisor Shashwat Ghatiwala Georg Groh 70 1 0 25 Nov 2024
Joint Vision-Language Social Bias Removal for CLIP Haoyu Zhang Yangyang Guo Mohan S. Kankanhalli VLM 67 0 0 19 Nov 2024
Mitigating Gender Bias in Contextual Word Embeddings Navya Yarrabelly Vinay Damodaran Feng-Guang Su 62 0 0 18 Nov 2024
Bias in Large Language Models: Origin, Evaluation, and Mitigation Yufei Guo Muzhe Guo Juntao Su Zhou Yang Mengqiu Zhu Hongfei Li Mengyang Qiu Shuo Shuo Liu AILaw 23 8 0 16 Nov 2024
Identifying Implicit Social Biases in Vision-Language Models Kimia Hamidieh Haoran Zhang Walter Gerych Thomas Hartvigsen Marzyeh Ghassemi VLM 28 11 0 01 Nov 2024
FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs Zhiting Fan Ruizhe Chen Tianxiang Hu Zuozhu Liu 21 7 0 25 Oct 2024
Large Language Models Still Exhibit Bias in Long Text Wonje Jeung Dongjae Jeon Ashkan Yousefpour Jonghyun Choi ALM 29 2 0 23 Oct 2024
LLMScan: Causal Scan for LLM Misbehavior Detection Mengdi Zhang Kai Kiat Goh Peixin Zhang Jun Sun 18 0 0 22 Oct 2024
Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models Eddie L. Ungless Nikolas Vitsakis Zeerak Talat James Garforth Bjorn Ross Arno Onken Atoosa Kasirzadeh Alexandra Birch 28 1 0 17 Oct 2024
Evaluating Gender Bias of LLMs in Making Morality Judgements Divij Bajaj Yuanyuan Lei Jonathan Tong Ruihong Huang 35 2 0 13 Oct 2024
Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs Divyanshu Kumar Umang Jain Sahil Agarwal P. Harshangi 23 4 0 13 Oct 2024
Collapsed Language Models Promote Fairness Jingxuan Xu Wuyang Chen Linyi Li Yao Zhao Yunchao Wei 39 0 0 06 Oct 2024
Towards Implicit Bias Detection and Mitigation in Multi-Agent LLM Interactions Angana Borah Rada Mihalcea 30 7 0 03 Oct 2024
Racing Thoughts: Explaining Contextualization Errors in Large Language Models Michael A. Lepori Michael Mozer Asma Ghandeharioun LRM 80 1 0 02 Oct 2024
REFINE-LM: Mitigating Language Model Stereotypes via Reinforcement Learning Rameez Qureshi Naim Es-Sebbani Luis Galárraga Yvette Graham Miguel Couceiro Zied Bouraoui 26 1 0 18 Aug 2024
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models Yi-Cheng Lin Wei-Chih Chen Hung-yi Lee 38 1 0 14 Aug 2024
ML-EAT: A Multilevel Embedding Association Test for Interpretable and Transparent Social Science Robert Wolfe Alexis Hiniker Bill Howe 35 0 0 04 Aug 2024
Downstream bias mitigation is all you need Arkadeep Baksi Rahul Singh Tarun Joshi AI4CE 22 0 0 01 Aug 2024
The BIAS Detection Framework: Bias Detection in Word Embeddings and Language Models for European Languages A. Puttick Leander Rankwiler Catherine Ikae Mascha Kurpicz-Briki 16 0 0 26 Jul 2024
Fairness Definitions in Language Models Explained Thang Viet Doan Zhibo Chu Zichong Wang Wenbin Zhang ALM 50 10 0 26 Jul 2024
Understanding the Interplay of Scale, Data, and Bias in Language Models: A Case Study with BERT Muhammad Ali Swetasudha Panda Qinlan Shen Michael Wick Ari Kobren MILM 27 3 0 25 Jul 2024
BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs Zhiting Fan Ruizhe Chen Ruiling Xu Zuozhu Liu KELM 16 15 0 14 Jul 2024
Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models Zara Siddique Liam D. Turner Luis Espinosa-Anke 29 0 0 09 Jul 2024
An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models Jayanta Sadhu Maneesha Rani Saha Rifat Shahriyar 27 0 0 08 Jul 2024
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models Xavier Suau Pieter Delobelle Katherine Metcalf Armand Joulin N. Apostoloff Luca Zappella P. Rodríguez MU AAML 32 8 0 02 Jul 2024