2306.05685
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
9 June 2023
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
Yonghao Zhuang
Zi Lin
Zhuohan Li
Dacheng Li
Eric P. Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
Papers citing
"Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena"
50 / 2,880 papers shown
Title
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
30
14
0
31 Jul 2023
Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection
Jun Yan
Vikas Yadav
Shiyang Li
Lichang Chen
Zheng Tang
Hai Wang
Vijay Srinivasan
Xiang Ren
Hongxia Jin
SILM
22
75
0
31 Jul 2023
NLLG Quarterly arXiv Report 06/23: What are the most influential current AI Papers?
Steffen Eger
Christoph Leiter
Jonas Belouadi
Ran Zhang
Aida Kostikova
Daniil Larionov
Yanran Chen
Vivian Fresen
AI4CE
29
4
0
31 Jul 2023
Camoscio: an Italian Instruction-tuned LLaMA
Andrea Santilli
Emanuele Rodolà
19
26
0
31 Jul 2023
HouYi: An open-source large language model specially designed for renewable energy and carbon neutrality field
Mingliang Bai
Zhihao Zhou
Ruidong Wang
Yusheng Yang
Zizhen Qin
Yunxia Chen
Chunjin Mu
Jinfu Liu
Daren Yu
13
2
0
31 Jul 2023
CHATREPORT: Democratizing Sustainability Disclosure Analysis through LLM-based Tools
Jingwei Ni
J. Bingler
Chiara Colesanti-Senni
Mathias Kraus
Glen Gostlow
...
Qian Wang
Nicolas Webersinke
Tobias Wekhof
Ting Yu
Markus Leippold
31
29
0
28 Jul 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models
Andy Zou
Zifan Wang
Nicholas Carlini
Milad Nasr
J. Zico Kolter
Matt Fredrikson
89
1,266
0
27 Jul 2023
SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark
Liang Xu
Anqi Li
Lei Zhu
Han Xue
Changtai Zhu
Kangkang Zhao
Hao He
Xuanwei Zhang
Qiyue Kang
Zhenzhong Lan
RALM
ELM
LRM
12
51
0
27 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
F. Khan
VLM
38
118
0
25 Jul 2023
Fashion Matrix: Editing Photos by Just Talking
Zheng Chong
Xujie Zhang
Fuwei Zhao
Zhenyu Xie
Xiaodan Liang
DiffM
21
2
0
25 Jul 2023
L-Eval: Instituting Standardized Evaluation for Long Context Language Models
Chen An
Shansan Gong
Ming Zhong
Xingjian Zhao
Mukai Li
Jun Zhang
Lingpeng Kong
Xipeng Qiu
ELM
ALM
40
132
0
20 Jul 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Seonghyeon Ye
Doyoung Kim
Sungdong Kim
Hyeonbin Hwang
Seungone Kim
Yongrae Jo
James Thorne
Juho Kim
Minjoon Seo
ALM
40
98
0
20 Jul 2023
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Jianguo Zhang
Kun Qian
Zhiwei Liu
Shelby Heinecke
Rui Meng
Ye Liu
Zhou Yu
Huan Wang
Silvio Savarese
Caiming Xiong
33
22
0
19 Jul 2023
Code Detection for Hardware Acceleration Using Large Language Models
Pablo Antonio Martínez
Gregorio Bernabé
J. M. García
25
2
0
19 Jul 2023
Emotional Intelligence of Large Language Models
Xuena Wang
Xueting Li
Zi Yin
Yue Wu
Tsinghua University
25
74
0
18 Jul 2023
AlpaGasus: Training A Better Alpaca with Fewer Data
Lichang Chen
Shiyang Li
Jun Yan
Hai Wang
Kalpa Gunaratna
...
Zheng Tang
Vijay Srinivasan
Dinesh Manocha
Heng-Chiao Huang
Hongxia Jin
ALM
44
0
0
17 Jul 2023
MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots
Gelei Deng
Yi Liu
Yuekang Li
Kailong Wang
Ying Zhang
Zefeng Li
Haoyu Wang
Tianwei Zhang
Yang Liu
SILM
37
118
0
16 Jul 2023
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models
Adian Liusie
Potsawee Manakul
Mark J. F. Gales
ELM
24
35
0
15 Jul 2023
Large Language Models Understand and Can be Enhanced by Emotional Stimuli
Cheng-rong Li
Jindong Wang
Yixuan Zhang
Kaijie Zhu
Wenxin Hou
Jianxun Lian
Fang Luo
Qiang Yang
Xingxu Xie
LRM
80
120
0
14 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Bo-wen Li
Songyang Zhang
...
Jiaqi Wang
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
29
907
0
12 Jul 2023
Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration
Zhenhailong Wang
Shaoguang Mao
Wenshan Wu
Tao Ge
Furu Wei
Heng Ji
LLMAG
LRM
26
132
0
11 Jul 2023
Emu: Generative Pretraining in Multimodality
Quan-Sen Sun
Qiying Yu
Yufeng Cui
Fan Zhang
Xiaosong Zhang
Yueze Wang
Hongcheng Gao
Jingjing Liu
Tiejun Huang
Xinlong Wang
MLLM
37
126
0
11 Jul 2023
Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models
Zhouhong Gu
Lin Zhang
Jiangjie Chen
Haoning Ye
Xiaoxuan Zhu
...
Jianchen Wang
Yikai Zhang
Wenhao Huang
Yanghua Xiao
Hongwei Feng
RALM
ELM
34
0
0
11 Jul 2023
Secrets of RLHF in Large Language Models Part I: PPO
Rui Zheng
Shihan Dou
Songyang Gao
Yuan Hua
Wei Shen
...
Hang Yan
Tao Gui
Qi Zhang
Xipeng Qiu
Xuanjing Huang
ALM
OffRL
41
158
0
11 Jul 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
VLM
MLLM
85
224
0
07 Jul 2023
A Survey on Evaluation of Large Language Models
Yu-Chu Chang
Xu Wang
Jindong Wang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM
LM&MA
ALM
69
1,513
0
06 Jul 2023
Style Over Substance: Evaluation Biases for Large Language Models
Minghao Wu
Alham Fikri Aji
ALM
ELM
30
43
0
06 Jul 2023
What Should Data Science Education Do with Large Language Models?
Xinming Tu
James Zou
Weijie J. Su
Linjun Zhang
AI4Ed
39
32
0
06 Jul 2023
Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning
Deepanway Ghosal
Yew Ken Chia
Navonil Majumder
Soujanya Poria
ALM
LRM
30
17
0
05 Jul 2023
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
Jinhao Duan
Hao-Ran Cheng
Shiqi Wang
Alex Zavalny
Chenan Wang
Renjing Xu
B. Kailkhura
Kaidi Xu
30
32
0
03 Jul 2023
Visual Instruction Tuning with Polite Flamingo
Delong Chen
Jianfeng Liu
Wenliang Dai
Baoyuan Wang
MLLM
34
42
0
03 Jul 2023
Preference Ranking Optimization for Human Alignment
Feifan Song
Yu Bowen
Minghao Li
Haiyang Yu
Fei Huang
Yongbin Li
Houfeng Wang
ALM
26
236
0
30 Jun 2023
On the Exploitability of Instruction Tuning
Manli Shu
Jiong Wang
Chen Zhu
Jonas Geiping
Chaowei Xiao
Tom Goldstein
SILM
31
91
0
28 Jun 2023
Composing Parameter-Efficient Modules with Arithmetic Operations
Jinghan Zhang
Shiqi Chen
Junteng Liu
Junxian He
KELM
MoMe
26
109
0
26 Jun 2023
H₂O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu (Allen) Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
52
254
0
24 Jun 2023
Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping
Daniel Zou
X. Jin
Xueyang Yu
Haotian Zhang
J. Demmel
MoE
29
0
0
24 Jun 2023
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Xudong Shen
H. Brown
Jiashu Tao
Martin Strobel
Yao Tong
Akshay Narayan
Harold Soh
Finale Doshi-Velez
27
3
0
22 Jun 2023
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Shizhe Diao
Rui Pan
Hanze Dong
Kashun Shum
Jipeng Zhang
Wei Xiong
Tong Zhang
ALM
20
63
0
21 Jun 2023
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
Xuan-Phi Nguyen
Sharifah Mahani Aljunied
Shafiq R. Joty
Lidong Bing
18
32
0
20 Jun 2023
CHORUS: Foundation Models for Unified Data Discovery and Exploration
Moe Kayali
A. Lykov
Ilias Fountalis
N. Vasiloglou
Dan Olteanu
Dan Suciu
25
21
0
16 Jun 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
Jifan Yu
Xiaozhi Wang
Shangqing Tu
S. Cao
Daniel Zhang-Li
...
Lei Hou
Zhiyuan Liu
Bin Xu
Jie Tang
Juanzi Li
ELM
ALM
36
66
0
15 Jun 2023
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
Peng-Tao Xu
Wenqi Shao
Kaipeng Zhang
Peng Gao
Shuo Liu
Meng Lei
Fanqing Meng
Siyuan Huang
Yu Qiao
Ping Luo
ELM
MLLM
33
159
0
15 Jun 2023
MiniLLM: Knowledge Distillation of Large Language Models
Yuxian Gu
Li Dong
Furu Wei
Minlie Huang
ALM
31
77
0
14 Jun 2023
Model Spider: Learning to Rank Pre-Trained Models Efficiently
Yi-Kai Zhang
Ting Huang
Yao-Xiang Ding
De-Chuan Zhan
Han-Jia Ye
31
23
0
06 Jun 2023
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
Dongfu Jiang
Xiang Ren
Bill Yuchen Lin
ELM
22
268
0
05 Jun 2023
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft
Shalev Lifshitz
Keiran Paster
Harris Chan
Jimmy Ba
Sheila A. McIlraith
LM&Ro
24
67
0
01 Jun 2023
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
Chunyuan Li
Cliff Wong
Sheng Zhang
Naoto Usuyama
Haotian Liu
Jianwei Yang
Tristan Naumann
Hoifung Poon
Jianfeng Gao
LM&MA
MedIm
51
700
0
01 Jun 2023
Rethinking Model Evaluation as Narrowing the Socio-Technical Gap
Q. V. Liao
Ziang Xiao
ALM
ELM
47
29
0
01 Jun 2023
Large Language Models are not Fair Evaluators
Peiyi Wang
Lei Li
Liang Chen
Zefan Cai
Dawei Zhu
Binghuai Lin
Yunbo Cao
Qi Liu
Tianyu Liu
Zhifang Sui
ALM
20
515
0
29 May 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Zechun Liu
Barlas Oğuz
Changsheng Zhao
Ernie Chang
Pierre Stock
Yashar Mehdad
Yangyang Shi
Raghuraman Krishnamoorthi
Vikas Chandra
MQ
51
188
0
29 May 2023