Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation


20 February 2024
Kristian Lum
Jacy Reese Anthis
Chirag Nagpal
Alexander D’Amour

Papers citing "Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation"

13 / 13 papers shown
Agree to Disagree? A Meta-Evaluation of LLM Misgendering
Arjun Subramonian
Vagrant Gautam
Preethi Seshadri
Dietrich Klakow
Kai-Wei Chang
Yizhou Sun
27
1
0
23 Apr 2025
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge
Riccardo Cantini
A. Orsino
Massimo Ruggiero
Domenico Talia
AAML
ELM
40
0
0
10 Apr 2025
LLM Social Simulations Are a Promising Research Method
Jacy Reese Anthis
Ryan Liu
Sean M. Richardson
Austin C. Kozlowski
Bernard Koch
James A. Evans
Erik Brynjolfsson
Michael S. Bernstein
ALM
49
4
0
03 Apr 2025
Toward an Evaluation Science for Generative AI Systems
Laura Weidinger
Deb Raji
Hanna M. Wallach
Margaret Mitchell
Angelina Wang
Olawale Salaudeen
Rishi Bommasani
Sayash Kapoor
Deep Ganguli
Sanmi Koyejo
EGVM
ELM
62
3
0
07 Mar 2025
Do LLMs exhibit demographic parity in responses to queries about Human Rights?
Rafiya Javed
Jackie Kay
David Yanni
Abdullah Zaini
Anushe Sheikh
Maribeth Rauh
Iason Gabriel
Laura Weidinger
51
0
0
26 Feb 2025
Towards Effective Discrimination Testing for Generative AI
Thomas P. Zollo
Nikita Rajaneesh
Richard Zemel
Talia B. Gillis
Emily Black
30
1
0
31 Dec 2024
HateDay: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter
Manuel Tonneau
Diyi Liu
Niyati Malhotra
Scott A. Hale
Samuel Fraiberger
Victor Orozco-Olvera
Paul Röttger
61
0
0
23 Nov 2024
Fairness Definitions in Language Models Explained
Thang Viet Doan
Zhibo Chu
Zichong Wang
Wenbin Zhang
ALM
50
10
0
26 Jul 2024
Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
Riccardo Cantini
Giada Cosenza
A. Orsino
Domenico Talia
AAML
45
5
0
11 Jul 2024
The Impossibility of Fair LLMs
Jacy Reese Anthis
Kristian Lum
Michael Ekstrand
Avi Feller
Alexander D’Amour
Chenhao Tan
FaML
29
10
0
28 May 2024
Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps
Giuseppe Attanasio
Beatrice Savoldi
Dennis Fucci
Dirk Hovy
31
4
0
28 Feb 2024
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
215
1,701
0
07 Apr 2023
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013