2403.16437
Reasoning Runtime Behavior of a Program with LLM: How Far Are We?
25 March 2024
Junkai Chen
Zhiyuan Pan
Xing Hu
Zhenhao Li
Ge Li
Xin Xia
LRM
Papers citing
"Reasoning Runtime Behavior of a Program with LLM: How Far Are We?"
10 / 10 papers shown
Title
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
Simeng Sun
Cheng-Ping Hsieh
Faisal Ladhak
Erik Arakelyan
Santiago Akle Serano
Boris Ginsburg
ReLM
ELM
LRM
49
0
0
28 Mar 2025
Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code
Shahin Honarvar
Mark van der Wilk
Alastair Donaldson
74
6
0
28 Jan 2025
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Miltiadis Allamanis
Sheena Panthaplackel
Pengcheng Yin
ALM
OffRL
LRM
43
9
0
13 Feb 2024
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp
A. Ghobadzadeh
Caiming Xiong
Silvio Savarese
Yingbo Zhou
141
163
0
03 May 2023
Unpacking Large Language Models with Conceptual Consistency
Pritish Sahu
Michael Cogswell
Yunye Gong
Ajay Divakaran
LRM
79
16
0
29 Sep 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond
Francesco Barbieri
Luis Espinosa Anke
Jose Camacho-Collados
76
211
0
25 Apr 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
190
853
0
09 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
255
343
0
01 Feb 2021