2407.08441
Cited By
Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
11 July 2024
Riccardo Cantini
Giada Cosenza
A. Orsino
Domenico Talia
AAML
Papers citing
"Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation"
7 / 7 papers shown
Title
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge
Riccardo Cantini
A. Orsino
Massimo Ruggiero
Domenico Talia
AAML
ELM
40
0
0
10 Apr 2025
Hey GPT, Can You be More Racist? Analysis from Crowdsourced Attempts to Elicit Biased Content from Generative AI
Hangzhi Guo
Pranav Narayanan Venkit
Eunchae Jang
Mukund Srinath
Wenbo Zhang
Bonam Mingole
Vipul Gupta
Kush R. Varshney
S. Shyam Sundar
A. Yadav
27
3
0
20 Oct 2024
User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
Xianzhe Fan
Qing Xiao
Xuhui Zhou
Jiaxin Pei
Maarten Sap
Zhicong Lu
Hong Shen
45
5
0
01 Sep 2024
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma
Satyapriya Krishna
Sebastian Gehrmann
Madhavan Seshadri
Anu Pradhan
Tom Ault
Leslie Barrett
David Rabinowitz
John Doucette
Nhathai Phan
47
6
0
20 Jul 2024
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
197
2,232
0
22 Mar 2023
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
254
374
0
28 Feb 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
198
607
0
03 Sep 2019