Finding Blind Spots in Evaluator LLMs with Interpretable Checklists

19 June 2024

Papers citing "Finding Blind Spots in Evaluator LLMs with Interpretable Checklists"

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, ..., Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, Huan Liu
ELM, AILaw | 99 61 0 | 25 Nov 2024

HEALTH-PARIKSHA: Assessing RAG Models for Health Chatbots in Real-World Multilingual Settings
Varun Gumma, Anandhita Raghunath, Mohit Jain, Sunayana Sitaram
LM&MA | 14 1 0 | 17 Oct 2024

Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang, Hung-yi Lee
ALM, LM&MA | 201 559 0 | 03 May 2023

Perturbation CheckLists for Evaluating NLG Evaluation Metrics
Ananya B. Sai, Tanay Dixit, D. Y. Sheth, S. Mohan, Mitesh M. Khapra
AAML | 85 55 0 | 13 Sep 2021