ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.05685
  4. Cited By
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

9 June 2023
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
Yonghao Zhuang
Zi Lin
Zhuohan Li
Dacheng Li
Eric P. Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
    ALM
    OSLM
    ELM
ArXivPDFHTML

Papers citing "Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena"

30 / 2,880 papers shown
Title
Lawyer LLaMA Technical Report
Lawyer LLaMA Technical Report
Quzhe Huang
Mingxu Tao
Chen Zhang
Zhenwei An
Cong Jiang
Zhibin Chen
Zirui Wu
Yansong Feng
ELM
ALM
AILaw
34
49
0
24 May 2023
In-Context Impersonation Reveals Large Language Models' Strengths and
  Biases
In-Context Impersonation Reveals Large Language Models' Strengths and Biases
Leonard Salewski
Stephan Alaniz
Isabel Rio-Torto
Eric Schulz
Zeynep Akata
44
149
0
24 May 2023
Automatic Model Selection with Large Language Models for Reasoning
Automatic Model Selection with Large Language Models for Reasoning
Xu Zhao
Yuxi Xie
Kenji Kawaguchi
Junxian He
Qizhe Xie
ReLM
LRM
34
27
0
23 May 2023
Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation
Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation
Da Yin
Xiao Liu
Fan Yin
Ming Zhong
Hritik Bansal
Jiawei Han
Kai-Wei Chang
ALM
34
37
0
23 May 2023
QTSumm: Query-Focused Summarization over Tabular Data
QTSumm: Query-Focused Summarization over Tabular Data
Yilun Zhao
Zhenting Qi
Linyong Nan
Boyu Mi
Yixin Liu
...
Ruizhe Chen
Xiangru Tang
Yumo Xu
Dragomir R. Radev
Arman Cohan
RALM
LMTD
33
1
0
23 May 2023
INSTRUCTSCORE: Explainable Text Generation Evaluation with Finegrained
  Feedback
INSTRUCTSCORE: Explainable Text Generation Evaluation with Finegrained Feedback
Wenda Xu
Danqing Wang
Liangming Pan
Zhenqiao Song
Markus Freitag
Luu Anh Tuan
Lei Li
ALM
ELM
36
17
0
23 May 2023
Enhancing Large Language Models Against Inductive Instructions with
  Dual-critique Prompting
Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting
Rui Wang
Hongru Wang
Fei Mi
Yi Chen
Boyang Xue
Kam-Fai Wong
Rui-Lan Xu
29
13
0
23 May 2023
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large
  Language Models in Knowledge Conflicts
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
Jian Xie
Kai Zhang
Jiangjie Chen
Renze Lou
Yu-Chuan Su
RALM
211
155
0
22 May 2023
AlpacaFarm: A Simulation Framework for Methods that Learn from Human
  Feedback
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
Yann Dubois
Xuechen Li
Rohan Taori
Tianyi Zhang
Ishaan Gulrajani
Jimmy Ba
Carlos Guestrin
Percy Liang
Tatsunori B. Hashimoto
ALM
45
542
0
22 May 2023
CLASS: A Design Framework for building Intelligent Tutoring Systems
  based on Learning Science principles
CLASS: A Design Framework for building Intelligent Tutoring Systems based on Learning Science principles
Shashank Sonkar
Lucy Liu
D. B. Mallick
Richard G. Baraniuk
57
38
0
22 May 2023
DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated
  Text Detection
DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated Text Detection
Xiao Yu
Yuang Qi
Kejiang Chen
Guoqiang Chen
Xi Yang
Pengyuan Zhu
Xiuwei Shang
Weiming Zhang
Neng H. Yu
DeLMO
13
11
0
21 May 2023
Evaluating the Performance of Large Language Models on GAOKAO Benchmark
Evaluating the Performance of Large Language Models on GAOKAO Benchmark
Xiaotian Zhang
Chun-yan Li
Yi Zong
Zhengyu Ying
Liang He
Xipeng Qiu
ALM
ELM
16
97
0
21 May 2023
InstructIE: A Bilingual Instruction-based Information Extraction Dataset
InstructIE: A Bilingual Instruction-based Information Extraction Dataset
Honghao Gui
Shuofei Qiao
Jintian Zhang
Hongbin Ye
Mengshu Sun
Lei Liang
Jeff Z. Pan
Huajun Chen
Ningyu Zhang
31
7
0
19 May 2023
Automatic Evaluation of Attribution by Large Language Models
Automatic Evaluation of Attribution by Large Language Models
Xiang Yue
Boshi Wang
Ziru Chen
Kai Zhang
Yu-Chuan Su
Huan Sun
ALM
LRM
HILM
35
54
0
10 May 2023
Can Large Language Models Be an Alternative to Human Evaluations?
Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang
Hung-yi Lee
ALM
LM&MA
224
572
0
03 May 2023
A Comprehensive Evaluation of Neural SPARQL Query Generation from
  Natural Language Questions
A Comprehensive Evaluation of Neural SPARQL Query Generation from Natural Language Questions
Papa Abdou Karim Karou Diallo
Samuel Reyd
Amal Zouaq
11
6
0
16 Apr 2023
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Haoran Li
Dadi Guo
Wei Fan
Mingshi Xu
Jie Huang
Fanpu Meng
Yangqiu Song
SILM
47
321
0
11 Apr 2023
Instruction Tuning with GPT-4
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
159
579
0
06 Apr 2023
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELM
ALM
LM&MA
53
1,078
0
29 Mar 2023
Error Analysis Prompting Enables Human-Like Translation Evaluation in
  Large Language Models
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
Qingyu Lu
Baopu Qiu
Liang Ding
Liping Xie
Tom Kocmi
Dacheng Tao
LRM
ALM
ELM
26
107
0
24 Mar 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
298
2,232
0
22 Mar 2023
SAINE: Scientific Annotation and Inference Engine of Scientific Research
SAINE: Scientific Annotation and Inference Engine of Scientific Research
Susie Xi Rao
Yi-Lin Tu
P. Egger
19
1
0
28 Feb 2023
Guiding Large Language Models via Directional Stimulus Prompting
Guiding Large Language Models via Directional Stimulus Prompting
Zekun Li
Baolin Peng
Pengcheng He
Michel Galley
Jianfeng Gao
Xi Yan
LLMAG
LRM
LM&Ro
40
94
0
22 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Using In-Context Learning to Improve Dialogue Safety
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
30
38
0
02 Feb 2023
Quality at the Tail of Machine Learning Inference
Quality at the Tail of Machine Learning Inference
Zhengxin Yang
Wanling Gao
Chunjie Luo
Lei Wang
Fei Tang
Xu Wen
Jianfeng Zhan
38
1
0
25 Dec 2022
Ontologically Faithful Generation of Non-Player Character Dialogues
Ontologically Faithful Generation of Non-Player Character Dialogues
Nathaniel Weir
Ryan Thomas
Randolph DÁmore
Kellie Hill
Benjamin Van Durme
Harsh Jhamtani
31
6
0
20 Dec 2022
Defending Against Disinformation Attacks in Open-Domain Question
  Answering
Defending Against Disinformation Attacks in Open-Domain Question Answering
Orion Weller
Aleem Khan
Nathaniel Weir
Dawn J Lawrie
Benjamin Van Durme
AAML
70
4
0
20 Dec 2022
Can large language models reason about medical questions?
Can large language models reason about medical questions?
Valentin Liévin
C. Hother
Andreas Geert Motzfeldt
Ole Winther
ELM
LM&MA
AI4MH
LRM
26
299
0
17 Jul 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
319
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
382
8,495
0
28 Jan 2022
Previous
123...565758