CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Nikita Nangia, Clara Vania, Rasika Bhalerao, Samuel R. Bowman
arXiv:2010.00133 (30 September 2020)
Papers citing "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models" (50 of 125 papers shown):
Bridging AI and Carbon Capture: A Dataset for LLMs in Ionic Liquids and CBE Research (11 May 2025)
Gaurab Sarkar, Sougata Saha

Developing A Framework to Support Human Evaluation of Bias in Generated Free Response Text (05 May 2025)
Jennifer Healey, Laurie Byrum, Md Nadeem Akhtar, Surabhi Bhargava, Moumita Sinha

Emotions in the Loop: A Survey of Affective Computing for Emotional Support (02 May 2025)
Karishma Hegde, Hemadri Jayalath

BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models (30 Apr 2025)
Zhiting Fan, Ruizhe Chen, Zuozhu Liu

Gender and content bias in Large Language Models: a case study on Google Gemini 2.0 Flash Experimental (18 Mar 2025)
Roberto Balestri
Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models (10 Mar 2025) [ALM]
Kefan Song, Jin Yao, Runnan Jiang, Rohan Chandra, Shangtong Zhang

Gender Encoding Patterns in Pretrained Language Model Representations (09 Mar 2025)
Mahdi Zakizadeh, Mohammad Taher Pilehvar

Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation (17 Feb 2025)
Vera Neplenbroek, Arianna Bisazza, Raquel Fernández

Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs (04 Feb 2025)
Angelina Wang, Michelle Phan, Daniel E. Ho, Sanmi Koyejo

Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing (24 Jan 2025) [KELM]
Zeping Yu, Sophia Ananiadou
Characterization of GPU TEE Overheads in Distributed Data Parallel ML Training (20 Jan 2025)
Jonghytun Lee, Yongqin Wang, Rachit Rajat, M. Annavaram

BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages (17 Jan 2025)
Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, ..., Seid Muhie Yimam, Mohammad Taher Pilehvar, N. Ousidhoum, Jose Camacho-Collados, Alice H. Oh

Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring (13 Jan 2025)
Buse Sibel Korkmaz, Rahul Nair, Elizabeth M. Daly, Evangelos Anagnostopoulos, Christos Varytimidis, Antonio del Rio Chanona
ChineseSafe: A Chinese Benchmark for Evaluating Safety in Large Language Models (24 Oct 2024) [AILaw, ELM]
H. Zhang, Hongfu Gao, Qiang Hu, Guanhua Chen, L. Yang, Bingyi Jing, Hongxin Wei, Bing Wang, Haifeng Bai, Lei Yang

LLMScan: Causal Scan for LLM Misbehavior Detection (22 Oct 2024)
Mengdi Zhang, Kai Kiat Goh, Peixin Zhang, Jun Sun, Rose Lin Xin, Hongyu Zhang

Bias Similarity Across Large Language Models (15 Oct 2024)
Hyejun Jeong, Shiqing Ma, Amir Houmansadr

ELICIT: LLM Augmentation via External In-Context Capability (12 Oct 2024)
Futing Wang, Jianhao Yan, Yue Zhang, Tao Lin
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users (10 Oct 2024) [SILM]
Mengxuan Hu, Hongyi Wu, Zihan Guan, Ronghang Zhu, Dongliang Guo, Daiqing Qi, Sheng Li

AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances (17 Sep 2024)
Dhruv Agarwal, Mor Naaman, Aditya Vashistha

Acceptable Use Policies for Foundation Models (29 Aug 2024)
Kevin Klyman

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models (27 Aug 2024)
Wenxuan Zhang, Philip H. S. Torr, Mohamed Elhoseiny, Adel Bibi

From 'Showgirls' to 'Performers': Fine-tuning with Gender-inclusive Language for Bias Reduction in LLMs (05 Jul 2024)
Marion Bartl, Susan Leavy
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models (02 Jul 2024) [CoGe]
Song Wang, Peng Wang, Tong Zhou, Yushun Dong, Zhen Tan, Jundong Li

Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender? (15 Jun 2024)
Haozhe An, Christabel Acquaye, Colin Wang, Zongxia Li, Rachel Rudinger

Benchmark Data Contamination of Large Language Models: A Survey (06 Jun 2024) [ELM, ALM]
Cheng Xu, Shuhao Guan, Derek Greene, Mohand-Tahar Kechadi

Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models (06 Jun 2024)
Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, Jong C. Park

Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals (30 May 2024)
Phillip Howard, Kathleen C. Fraser, Anahita Bhiwandiwalla, S. Kiritchenko
Aligning to Thousands of Preferences via System Message Generalization (28 May 2024) [ALM]
Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo

GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction (24 May 2024)
Virginia K. Felkner, Jennifer A. Thompson, Jonathan May

Quite Good, but Not Enough: Nationality Bias in Large Language Models -- A Case Study of ChatGPT (11 May 2024)
Shucheng Zhu, Weikang Wang, Ying Liu

Are Models Biased on Text without Gender-related Language? (01 May 2024)
Catarina G Belém, P. Seshadri, Yasaman Razeghi, Sameer Singh

SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety (08 Apr 2024) [ELM, KELM]
Paul Röttger, Fabio Pernisi, Bertie Vidgen, Dirk Hovy
Specification Overfitting in Artificial Intelligence (13 Mar 2024)
Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab

Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution (05 Mar 2024)
Flor Miriam Plaza del Arco, A. C. Curry, Alba Curry, Gavin Abercrombie, Dirk Hovy

LLM-Assisted Content Conditional Debiasing for Fair Text Embedding (22 Feb 2024)
Wenlong Deng, Blair Chen, Beidi Zhao, Chiyu Zhang, Xiaoxiao Li, Christos Thrampoulidis

COBIAS: Assessing the Contextual Reliability of Bias Benchmarks for Language Models (22 Feb 2024)
Priyanshul Govil, Hemang Jain, Vamshi Krishna Bonagiri, Aman Chadha, Ponnurangam Kumaraguru, Manas Gaur, Sanorita Dey

Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality (21 Feb 2024)
Rahul Zalkikar, Kanchan Chandra
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation (20 Feb 2024)
Kristian Lum, Jacy Reese Anthis, Chirag Nagpal, Alexander D’Amour

Explaining Probabilistic Models with Distributional Values (15 Feb 2024) [FAtt]
Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger

CroissantLLM: A Truly Bilingual French-English Language Model (01 Feb 2024)
Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, ..., François Yvon, André F.T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo

Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting (28 Jan 2024) [LRM]
Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki, Timothy Baldwin

Quantifying Stereotypes in Language (28 Jan 2024)
Yang Liu
Multilingual large language models leak human stereotypes across language boundaries (12 Dec 2023) [PILM]
Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, Hal Daumé

Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model (19 Oct 2023)
Abhijith Chintam, Rahel Beloch, Willem H. Zuidema, Michael Hanna, Oskar van der Wal

Emerging Challenges in Personalized Medicine: Assessing Demographic Effects on Biomedical Question Answering Systems (16 Oct 2023)
Sagi Shaier, Kevin Bennett, Lawrence E Hunter, K. Wense

Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI (21 Sep 2023) [LM&MA, ELM, AI4MH]
Mahyar Abbasian, Elahe Khatibi, Iman Azimi, David Oniani, Zahra Shakeri Hossein Abad, ..., Bryant Lin, Olivier Gevaert, Li-Jia Li, Ramesh C. Jain, Amir M. Rahmani
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model (20 Sep 2023) [MoE]
Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, ..., Robert Myers, Jacob Robert Steeves, Natalia Vassilieva, Marvin Tom, Joel Hestness

OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs (07 Sep 2023)
Patrick Haller, Ansar Aynetdinov, A. Akbik

Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection (31 Aug 2023)
Fatma Elsafoury

Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions (25 Aug 2023)
Reem I. Masoud, Ziquan Liu, Martin Ferianc, Philip C. Treleaven, Miguel R. D. Rodrigues