CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists

27 March 2024

Pilsung Kang

Najoung Kim

Papers citing "CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists"

Title
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation | SeongYeub Chu, JongWoo Kim, MunYong Yi | 53 1 0 | 21 Feb 2025
SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists | Raoyuan Zhao, Abdullatif Köksal, Yihong Liu, Leonie Weissweiler, Anna Korhonen, Hinrich Schütze | SyDa | 28 1 0 | 30 Aug 2024
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models | Yukyung Lee, Soonwon Ka, Bokyung Son, Pilsung Kang, Jaewook Kang | LLMAG | 42 6 0 | 22 Apr 2024
Finding a Balanced Degree of Automation for Summary Evaluation | Shiyue Zhang, Mohit Bansal | 47 43 0 | 23 Sep 2021