arXiv:2311.18812
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
30 November 2023
Raphael Tang
Xinyu Crystina Zhang
Jimmy J. Lin
Ferhan Ture
Papers citing "What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations"
5 / 5 papers shown
Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation
Xiangjue Dong, Yibo Wang, Philip S. Yu, James Caverlee
24 | 25 | 0 | 01 Nov 2023

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
301 | 11,730 | 0 | 04 Mar 2022

Assessing the Reliability of Word Embedding Gender Bias Measures
Yupei Du, Qixiang Fang, D. Nguyen
27 | 21 | 0 | 10 Sep 2021

Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
219 | 291 | 0 | 24 Feb 2021

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
AIMat
236 | 1,508 | 0 | 31 Dec 2020