2404.13957
Cited By
How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO
22 April 2024
Man Tik Ng
Hui Tung Tse
Jen-tse Huang
Jingjing Li
Wenxuan Wang
Michael R. Lyu
LLMAG
Papers citing "How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO"
5 / 5 papers shown
Title
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation
Ilya Gusev
LLMAG
32
3
0
10 Sep 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Jen-tse Huang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM
LLMAG
74
32
0
18 Mar 2024
Humans or LLMs as the Judge? A Study on Judgement Biases
Guiming Hardy Chen
Shunian Chen
Ziche Liu
Feng Jiang
Benyou Wang
56
89
0
16 Feb 2024
A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models
Yuxuan Wan
Wenxuan Wang
Yiliu Yang
Youliang Yuan
Jen-tse Huang
Pinjia He
Wenxiang Jiao
Michael R. Lyu
ELM
LRM
42
11
0
01 Jan 2024
CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
Jinfeng Zhou
Zhuang Chen
Dazhen Wan
Bosi Wen
Yi Song
...
Wenjing Hou
Yijia Zhang
Yuxiao Dong
Jie Tang
Minlie Huang
LLMAG
AI4CE
OSLM
85
11
0
28 Nov 2023