LLM Evaluators Recognize and Favor Their Own Generations

15 April 2024
Arjun Panickssery
Samuel R. Bowman
Shi Feng

Papers citing "LLM Evaluators Recognize and Favor Their Own Generations"

10 / 110 papers shown
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
Hyunseok Lee
Jihoon Tack
Jinwoo Shin
DeLMO
24
0
0
27 May 2024
Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models
Anna A. Ivanova
Aalok Sathe
Benjamin Lipkin
Unnathi Kumar
S. Radkani
...
Leshem Choshen
Roger Levy
Evelina Fedorenko
Josh Tenenbaum
Jacob Andreas
25
23
0
15 May 2024
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments
Samuel Schmidgall
Rojin Ziaei
Carl Harris
Eduardo Reis
Jeffrey Jopling
Michael Moor
38
42
0
13 May 2024
DOLOMITES: Domain-Specific Long-Form Methodical Tasks
Chaitanya Malaviya
Priyanka Agrawal
Kuzman Ganchev
Pranesh Srinivasan
Fantine Huot
Jonathan Berant
Mark Yatskar
Dipanjan Das
Mirella Lapata
Chris Alberti
19
6
0
09 May 2024
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Pat Verga
Sebastian Hofstatter
Sophia Althammer
Yixuan Su
Aleksandra Piktus
Arkady Arkhangorodsky
Minjie Xu
Naomi White
Patrick Lewis
ALM
ELM
24
87
0
29 Apr 2024
LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study
Van Bach Nguyen
Paul Youssef
Jorg Schlotterer
Christin Seifert
37
14
0
26 Apr 2024
DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
Xueqing Wu
Rui Zheng
Jingzhen Sha
Te-Lin Wu
Hanyu Zhou
Mohan Tang
Kai-Wei Chang
Nanyun Peng
Haoran Huang
47
1
0
04 Mar 2024
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan
Erik Jones
Meena Jagadeesan
Jacob Steinhardt
KELM
42
25
0
09 Feb 2024
Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models
Sumuk Shashidhar
Abhinav Chinta
Vaibhav Sahai
Zhenhailong Wang
Heng Ji
32
8
0
11 Oct 2023
LLark: A Multimodal Instruction-Following Language Model for Music
Josh Gardner
Simon Durand
Daniel Stoller
Rachel M. Bittner
AuLLM
15
14
0
11 Oct 2023