Template-Based Probes Are Imperfect Lenses for Counterfactual Bias Evaluation in LLMs
4 April 2024
Farnaz Kohankhaki, David B. Emerson, Laleh Seyyed-Kalantari, Faiza Khan Khattak
ArXiv (abs) · PDF · HTML · GitHub (1★)

Papers citing "Template-Based Probes Are Imperfect Lenses for Counterfactual Bias Evaluation in LLMs"

35 papers shown

Large Language Models are Geographically Biased
Rohin Manvi, Samar Khanna, Marshall Burke, David B. Lobell, Stefano Ermon
05 Feb 2024

Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting
Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki, Timothy Baldwin
Communities: LRM
28 Jan 2024

"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in
  LLM-Generated Reference Letters
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters
Yixin Wan
George Pu
Jiao Sun
Aparna Garimella
Kai-Wei Chang
Nanyun Peng
583
305
0
13 Oct 2023
Bias and Fairness in Large Language Models: A Survey
Computational Linguistics (CL), 2023
Isabel O. Gallegos, Ryan Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen Ahmed
Communities: AILaw
02 Sep 2023

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Yang Liu, Yuanshun Yao, Jean-François Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hanguang Li
Communities: ALM
10 Aug 2023

Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?
O. Dige, Jacob-Junqi Tian, David B. Emerson, Faiza Khan Khattak
Communities: ALM
19 Jul 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
Communities: AI4MH, ALM
18 Jul 2023

Soft-prompt Tuning for Large Language Models to Evaluate Bias
Jacob-Junqi Tian, David B. Emerson, Sevil Zanjani Miyandoab, D. Pandya, Laleh Seyyed-Kalantari, Faiza Khan Khattak
Communities: VLM
07 Jun 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Neural Information Processing Systems (NeurIPS), 2023
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
Communities: ALM
29 May 2023

Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Myra Cheng, Esin Durmus, Dan Jurafsky
29 May 2023

Comparing Biases and the Impact of Multilingual Training across Multiple Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Sharon Levy, Neha Ann John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth
18 May 2023

Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Morris Alper, Michael Fiman, Hadar Averbuch-Elor
Communities: VLM, LRM
21 Mar 2023

Auditing large language models: a three-layered approach
AI and Ethics (AE), 2023
Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
Communities: AILaw, MLAU
16 Feb 2023

The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli, Amanda Askell, Nicholas Schiefer, Thomas I. Liao, Kamilė Lukošiūtė, ..., Tom B. Brown, C. Olah, Jack Clark, Sam Bowman, Jared Kaplan
Communities: LRM, ReLM
15 Feb 2023

Do ever larger octopi still amplify reporting biases? Evidence from judgments of typical colour
Fangyu Liu, Julian Martin Eisenschlos, Jeremy R. Cole, Nigel Collier
26 Sep 2022

VIPHY: Probing "Visible" Physical Commonsense Knowledge
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Shikhar Singh, Ehsan Qasemi, Muhao Chen
15 Sep 2022

American == White in Multimodal Language-and-Image AI
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2022
Robert Wolfe, Aylin Caliskan
Communities: VLM
01 Jul 2022

What do Models Learn From Training on More Than Text? Measuring Visual Commonsense Knowledge
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Lovisa Hagström, Richard Johansson
Communities: VLM
14 May 2022

Using Natural Sentences for Understanding Biases in Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Sarah Alnegheimish, Alicia Guo, Yi Sun
12 May 2022

Visual Commonsense in Pretrained Unimodal and Multimodal Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Chenyu Zhang, Benjamin Van Durme, Zhuowan Li, Elias Stengel-Eskin
Communities: VLM, SSL
04 May 2022

OPT: Open Pre-trained Transformer Language Models
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, ..., Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer
Communities: VLM, OSLM, AI4CE
02 May 2022

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, ..., Jack Clark, Sam McCandlish, C. Olah, Benjamin Mann, Jared Kaplan
12 Apr 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
Communities: LM&Ro, LRM, AI4CE, ReLM
28 Jan 2022

Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, ..., Jeff Stanway, L. Bennett, Demis Hassabis, Koray Kavukcuoglu, G. Irving
08 Dec 2021

The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color
Cory Paik, Stéphane Aroca-Ouellette, Alessandro Roncone, Katharina Kann
15 Oct 2021

Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2021
Tobias Norlund, Lovisa Hagström, Richard Johansson
23 Sep 2021

Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics
Transactions of the Association for Computational Linguistics (TACL), 2021
Paula Czarnowska, Yogarshi Vyas, Kashif Shah
28 Jun 2021

Towards Understanding and Mitigating Social Biases in Language Models
Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency, Ruslan Salakhutdinov
24 Jun 2021

Persistent Anti-Muslim Bias in Large Language Models
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2021
Abubakar Abid, Maheen Farooqi, James Zou
Communities: AILaw
14 Jan 2021

Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
Communities: BDL
28 May 2020

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh
Communities: ELM
08 May 2020

CheXclusion: Fairness gaps in deep chest X-ray classifiers
Pacific Symposium on Biocomputing (PSB), 2020
Laleh Seyyed-Kalantari, Guanxiong Liu, Matthew B. A. McDermott, Irene Y. Chen, Marzyeh Ghassemi
Communities: OOD
14 Feb 2020

The Woman Worked as a Babysitter: On Biases in Language Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
03 Sep 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
Communities: AIMat
26 Jul 2019

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting
Maria De-Arteaga, Alexey Romanov, Hanna M. Wallach, J. Chayes, C. Borgs, Alexandra Chouldechova, S. Geyik, K. Kenthapadi, Adam Tauman Kalai
27 Jan 2019