2401.16745
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models
30 January 2024
Wai-Chung Kwan
Xingshan Zeng
Yuxin Jiang
Yufei Wang
Liangyou Li
Lifeng Shang
Xin Jiang
Qun Liu
Kam-Fai Wong
LRM
ELM
Papers citing "MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models"
11 / 11 papers shown
Title
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
23
0
0
09 May 2025
Training a Generally Curious Agent
Fahim Tajwar
Yiding Jiang
Abitha Thankaraj
Sumaita Sadia Rahman
J. Zico Kolter
Jeff Schneider
Ruslan Salakhutdinov
112
1
0
24 Feb 2025
InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context
Bryan L. M. de Oliveira
Luana G. B. Martins
Bruno Brandão
L. Melo
ELM
77
1
0
17 Feb 2025
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants
Lize Alberts
Benjamin Ellis
Andrei Lupu
Jakob Foerster
ELM
34
0
0
28 Oct 2024
Revisiting Benchmark and Assessment: An Agent-based Exploratory Dynamic Evaluation Framework for LLMs
Wanying Wang
Zeyu Ma
Pengfei Liu
Mingang Chen
LLMAG
43
1
0
15 Oct 2024
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback
Y. Li
Miao Zheng
Fan Yang
Guosheng Dong
Bin Cui
Weipeng Chen
Zenan Zhou
Wentao Zhang
ALM
26
5
0
12 Oct 2024
Post-hoc Reward Calibration: A Case Study on Length Bias
Zeyu Huang
Zihan Qiu
Zili Wang
Edoardo M. Ponti
Ivan Titov
36
5
0
25 Sep 2024
The use of GPT-4o and Other Large Language Models for the Improvement and Design of Self-Assessment Scales for Measurement of Interpersonal Communication Skills
Goran Bubaš
LM&MA
LLMAG
AI4MH
21
0
0
21 Sep 2024
M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models
Rishabh Maheshwary
Vikas Yadav
Hoang Nguyen
Khyati Mahajan
Sathwik Tejaswi Madhusudhan
35
3
0
24 Jun 2024
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
120
137
0
19 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022