2404.10952
Can Language Models Solve Olympiad Programming?
16 April 2024
Quan Shi
Michael Tang
Karthik Narasimhan
Shunyu Yao
ELM
LRM
ReLM
Papers citing "Can Language Models Solve Olympiad Programming?"
39 / 39 papers shown
ChemLabs on ChemO: A Multi-Agent System for Multimodal Reasoning on IChO 2025
Xu Qiang
Shengyuan Bai
Leqing Chen
Zijing Liu
Yu-Feng Li
LRM
116
0
0
20 Nov 2025
SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning
Mexican International Conference on Artificial Intelligence (MICAI), 2025
Xuchen Li
Ruitao Wu
Xuanbo Liu
Xukai Wang
Jinbo Hu
...
K. Huang
J. Xu
Haitao Mi
Wentao Zhang
Bin Dong
LLMAG
LM&Ro
LRM
AI4CE
594
1
0
11 Nov 2025
Secure Code Generation at Scale with Reflexion
Arup Datta
Ahmed Aljohani
Hyunsook Do
ELM
68
0
0
05 Nov 2025
QueST: Incentivizing LLMs to Generate Difficult Problems
Hanxu Hu
Xingxing Zhang
Jannis Vamvas
Rico Sennrich
Furu Wei
AIMat
SyDa
MQ
LRM
163
0
0
20 Oct 2025
A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications
Minhua Lin
Zongyu Wu
Zhichao Xu
Hui Liu
Xianfeng Tang
Qi He
Charu C. Aggarwal
Hui Liu
Xiang Zhang
Suhang Wang
AI4TS
LRM
356
1
0
19 Oct 2025
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
A. Zebaze
Rachel Bawden
Benoît Sagot
LRM
68
1
0
13 Oct 2025
LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?
Kaijian Zou
Aaron Xiong
Yunxiang Zhang
Frederick Zhang
Yueqi Ren
Jirong Yang
Ayoung Lee
Shitanshu Bhushan
Lu Wang
ReLM
ALM
ELM
LRM
61
1
0
10 Oct 2025
Scaling Laws for Code: A More Data-Hungry Regime
Xianzhen Luo
Wenzhen Zheng
Qingfu Zhu
Rongyi Zhang
Houyi Li
Siming Huang
YuanTao Fan
Wanxiang Che
ALM
88
1
0
09 Oct 2025
PRISM-Physics: Causal DAG-Based Process Evaluation for Physics Reasoning
Wanjia Zhao
Qinwei Ma
Jingzhe Shi
Shirley Wu
Jiaqi Han
Yijia Xiao
S. Chen
Xiao Luo
Ludwig Schmidt
James Zou
LRM
72
0
0
03 Oct 2025
Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm
Dadi Guo
Tianyi Zhou
Dongrui Liu
Chen Qian
Qihan Ren
...
Zhiyuan Fan
Yi R. Fung
Kun Wang
Linfeng Zhang
Jing Shao
108
0
0
01 Oct 2025
Can Multi-turn Self-refined Single Agent LMs with Retrieval Solve Hard Coding Problems?
Md Tanzib Hosain
Md Kishor Morol
ReLM
LRM
66
2
0
30 Aug 2025
AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions
Zihan Wang
Jiaze Chen
Zhicheng Liu
Markus Mak
Yidi Du
...
Y. Wu
Daoguang Zan
Y. Fu
Mingxuan Wang
Ming Ding
ELM
74
3
0
22 Aug 2025
Klear-CodeTest: Scalable Test Case Generation for Code Reinforcement Learning
Jia-Yi Fu
Xinyu Yang
Hongzhi Zhang
Yahui Liu
Jingyuan Zhang
Qi Wang
Fuzheng Zhang
Guorui Zhou
ELM
175
1
0
07 Aug 2025
A Survey of LLM-based Deep Search Agents: Paradigm, Optimization, Evaluation, and Challenges
Yunjia Xi
Jianghao Lin
Yongzhao Xiao
Zheli Zhou
Rong Shan
Te Gao
Jiachen Zhu
Weiwen Liu
Yong Yu
Weinan Zhang
LLMAG
ELM
226
15
0
03 Aug 2025
AlgoSimBench: Identifying Algorithmically Similar Problems for Competitive Programming
Jierui Li
Raymond J. Mooney
124
0
0
21 Jul 2025
OJBench: A Competition Level Code Benchmark For Large Language Models
Zhexu Wang
Y. Liu
Yejie Wang
Wenyang He
Bofei Gao
...
Kelin Fu
Flood Sung
Zhilin Yang
Tianyu Liu
Weiran Xu
ReLM
LRM
ELM
172
3
0
19 Jun 2025
OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics
Yaoming Zhu
Junxin Wang
Yiyang Li
Lin Qiu
Zongyu Wang
...
Xuezhi Cao
Yuhuai Wei
Mingshi Wang
Xunliang Cai
Rong Ma
LRM
268
3
0
12 Jun 2025
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
Yuki Imajuku
Kohki Horie
Yoichi Iwata
Kensho Aoki
Naohiro Takahashi
Takuya Akiba
164
6
0
10 Jun 2025
Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems
Yuhan Cao
Z. Chen
Kun Quan
Ziliang Zhang
Yu Wang
...
Shouchen Zhou
Yuxiang Zhu
Yiming Huang
Tian Xie
Tianxing He
ELM
LRM
189
3
0
07 Jun 2025
Sample Complexity and Representation Ability of Test-time Scaling Paradigms
Baihe Huang
Shanda Li
Tianhao Wu
Yiming Yang
Ameet Talwalkar
Kannan Ramchandran
Michael I. Jordan
Jiantao Jiao
LRM
291
1
0
05 Jun 2025
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
M. Andreux
Breno Baldas Skuk
Hamza Benchekroun
Emilien Biré
Antoine Bonnet
...
Marc Thibault
L. Thiry
Léo Tronchon
Nicolas Usunier
Tony Wu
LLMAG
164
4
0
03 Jun 2025
ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xinwei Yang
Zhaofeng Liu
Chen Huang
Jiashuai Zhang
Tong Zhang
Yifan Zhang
Wenqiang Lei
123
3
0
22 May 2025
Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO
Peter Chen
Xiaopeng Li
Zhiyu Li
Xi Chen
Tianyi Lin
407
0
0
16 May 2025
Synergizing RAG and Reasoning: A Systematic Review
Yunfan Gao
Yun Xiong
Yijie Zhong
Yuxi Bi
Ming Xue
Haoyu Wang
LRM
AI4CE
931
22
0
22 Apr 2025
IMPersona: Evaluating Individual Level LM Impersonation
Quan Shi
Carlos E. Jimenez
Stephen Dong
Brian Seo
Caden Yao
Adam Kelch
Karthik Narasimhan
215
1
0
06 Apr 2025
HoarePrompt: Structural Reasoning About Program Correctness in Natural Language
Dimitrios Stamatios Bouras
Yihan Dai
Tairan Wang
Yingfei Xiong
Sergey Mechtaev
LRM
350
1
0
25 Mar 2025
ProBench: Benchmarking Large Language Models in Competitive Programming
Lei Yang
Renren Jin
Ling Shi
Jianxiang Peng
Yue Chen
Deyi Xiong
ReLM
ELM
LRM
152
8
0
28 Feb 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yancheng He
Shilong Li
Jing Liu
Weixun Wang
Xingyuan Bu
...
Zhongyuan Peng
Zhenru Zhang
Zhicheng Zheng
Yuchi Xu
Bo Zheng
ELM
LRM
426
39
0
26 Feb 2025
KernelBench: Can LLMs Write Efficient GPU Kernels?
Anne Ouyang
Simon Guo
Simran Arora
Alex L. Zhang
William Hu
Christopher Ré
Azalia Mirhoseini
ALM
296
41
0
14 Feb 2025
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Shanghaoran Quan
Jiaxi Yang
Bowen Yu
Jian Xu
Dayiheng Liu
...
Zeyu Cui
Yang Fan
Yanzhe Zhang
Binyuan Hui
Junyang Lin
ALM
ELM
LRM
297
70
0
02 Jan 2025
Mixture of Parrots: Experts improve memorization more than reasoning
International Conference on Learning Representations (ICLR), 2024
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
294
14
0
24 Oct 2024
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Siwei Wu
Zhongyuan Peng
Xinrun Du
Tuney Zheng
Minghao Liu
...
Rundong Wang
Wenhao Huang
Ge Zhang
Chenghua Lin
J. H. Liu
ELM
LLMAG
LRM
AI4CE
222
67
0
17 Oct 2024
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
Zhenwen Liang
Ye Liu
Tong Niu
Xiangliang Zhang
Yingbo Zhou
Semih Yavuz
LRM
182
34
0
05 Oct 2024
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xingxuan Li
Weiwen Xu
Ruochen Zhao
Fangkai Jiao
Shafiq Joty
Lidong Bing
LRM
198
24
0
02 Oct 2024
Game On: Towards Language Models as RL Experimenters
Jingwei Zhang
Thomas Lampe
A. Abdolmaleki
Jost Tobias Springenberg
Martin Riedmiller
LM&Ro
174
0
0
05 Sep 2024
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Neural Information Processing Systems (NeurIPS), 2024
Yijia Shao
Tianshi Li
Weiyan Shi
Yanchen Liu
Diyi Yang
PILM
455
74
0
29 Aug 2024
AI Agents That Matter
Sayash Kapoor
Benedikt Stroebl
Zachary S. Siegel
Nitya Nadgir
Arvind Narayanan
224
83
0
01 Jul 2024
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
Zhen Huang
Zengzhi Wang
Shijie Xia
Xuefeng Li
Haoyang Zou
...
Yuxiang Zheng
Shaoting Zhang
Dahua Lin
Yu Qiao
Pengfei Liu
ELM
LRM
237
68
0
18 Jun 2024
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
DeepSeek-AI
Qihao Zhu
Daya Guo
Zhihong Shao
Dejian Yang
...
Jiashi Li
Chenggang Zhao
Chong Ruan
Fuli Luo
Wenfeng Liang
MoE
LRM
ELM
VLM
224
341
0
17 Jun 2024