Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.01026
Cited By
v1
v2 (latest)
Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors
3 June 2024
Mengge Xue
Zhenyu Hu
Liqun Liu
Kuo Liao
Shuang Li
Honglin Han
Meng Zhao
Chengguo Yin
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors"
14 / 14 papers shown
Title
Mitigating Selection Bias with Node Pruning and Auxiliary Options
Hyeong Kyu Choi
Weijie Xu
Chi Xue
Stephanie Eckman
Chandan K. Reddy
92
2
0
27 Sep 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
486
4,447
0
09 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
391
4,177
0
29 May 2023
Chain of Hindsight Aligns Language Models with Feedback
Hao Liu
Carmelo Sferrazza
Pieter Abbeel
ALM
139
124
0
06 Feb 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
430
2,396
0
09 Nov 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
540
6,305
0
05 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
907
13,240
0
04 Mar 2022
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Kimin Lee
Laura M. Smith
Pieter Abbeel
OffRL
67
289
0
09 Jun 2021
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
193
4,580
0
07 Sep 2020
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
265
2,194
0
02 Sep 2020
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
165
1,754
0
02 Nov 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
604
19,318
0
20 Jul 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
224
3,380
0
12 Jun 2017
ConceptNet 5.5: An Open Multilingual Graph of General Knowledge
R. Speer
Joshua Chin
Catherine Havasi
233
2,910
0
12 Dec 2016
1