Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization

27 September 2024
Mucong Ding
Chenghao Deng
Jocelyn Choo
Zichu Wu
Aakriti Agrawal
Avi Schwarzschild
Tianyi Zhou
Tom Goldstein
John Langford
Anima Anandkumar
Furong Huang
arXiv: 2409.18433

Papers citing "Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization"

A Weighted Byzantine Fault Tolerance Consensus Driven Trusted Multiple Large Language Models Network
Haoxiang Luo
Gang Sun
Yinqiu Liu
Dongcheng Zhao
Dusit Niyato
Hongfang Yu
Schahram Dustdar
08 May 2025
Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis
Wenbo Zhang
Hengrui Cai
Wenyu Chen
17 Feb 2025
EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?
Aakriti Agrawal
Mucong Ding
Zora Che
Chenghao Deng
Anirudh Satheesh
John Langford
Furong Huang
06 Oct 2024