ResearchTrend.AI

HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation
arXiv: 2412.21199

3 January 2025
Zhaojian Yu
Yilun Zhao
Arman Cohan
Xiao-Ping Zhang
    LRM

Papers citing "HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation"

2 / 2 papers shown
HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics
Lennart Luettgau
Harry Coppock
Magda Dubois
Christopher Summerfield
Cozmin Ududec
08 May 2025
CodeFlowBench: A Multi-turn, Iterative Benchmark for Complex Code Generation
Sizhe Wang
Z. Wang
Dongsheng Ma
Yongan Yu
Rui Ling
Z. Li
Feiyu Xiong
W. Zhang
LRM
30 Apr 2025