ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming

Annual Meeting of the Association for Computational Linguistics (ACL), 2025
22 May 2025
Xinwei Yang
Zhaofeng Liu
Chen Huang
Jiashuai Zhang
Tong Zhang
Yifan Zhang
Wenqiang Lei
arXiv: 2505.16667
Main: 8 pages · 5 figures · 19 tables · Bibliography: 5 pages · Appendix: 33 pages
Abstract

While recent research increasingly emphasizes the value of human-LLM collaboration in competitive programming and proposes numerous empirical methods, a comprehensive understanding remains elusive due to the fragmented nature of existing studies and their use of diverse, application-specific human feedback. Our work therefore serves a three-fold purpose. First, we present the first taxonomy of human feedback consolidating the entire programming process, which enables fine-grained evaluation. Second, we introduce ELABORATIONSET, a novel programming dataset specifically designed for human-LLM collaboration, meticulously annotated to enable large-scale simulated human feedback and to facilitate cost-effective real human interaction studies. Third, we introduce ELABORATION, a novel benchmark for the thorough assessment of human-LLM competitive programming. With ELABORATION, we pinpoint the strengths and weaknesses of existing methods, thereby laying the foundation for future improvement. Our code and dataset are available at this https URL
