ChatBench: From Static Benchmarks to Human-AI Evaluation

22 March 2025

Papers citing "ChatBench: From Static Benchmarks to Human-AI Evaluation"

LLMs Get Lost In Multi-Turn Conversation
Philippe Laban, Hiroaki Hayashi, Yingbo Zhou, Jennifer Neville
09 May 2025

LLMs Outperform Experts on Challenging Biology Benchmarks
Lennart Justen
Topic: ELM
09 May 2025