Let's Verify Step by Step

International Conference on Learning Representations (ICLR), 2023
31 May 2023
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALM, OffRL, LRM

Papers citing "Let's Verify Step by Step"

50 / 1,447 papers shown
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yew Ken Chia
Guizhen Chen
Weiwen Xu
Luu Anh Tuan
Soujanya Poria
Lidong Bing
LRM
253
5
0
07 Oct 2024
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
International Conference on Learning Representations (ICLR), 2024
Kaiyue Wen
Huaqing Zhang
Hongzhou Lin
Jingzhao Zhang
MoE, LRM
630
18
0
07 Oct 2024
Active Fine-Tuning of Multi-Task Policies
Marco Bagatella
Jonas Hübotter
Georg Martius
Andreas Krause
584
0
0
07 Oct 2024
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
Zhenwen Liang
Ye Liu
Tong Niu
Xiangliang Zhang
Yingbo Zhou
Semih Yavuz
LRM
278
35
0
05 Oct 2024
Misinformation with Legal Consequences (MisLC): A New Task Towards Harnessing Societal Harm of Misinformation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Chu Fei Luo
Radin Shayanfar
R. Bhambhoria
Samuel Dahan
Xiaodan Zhu
AILaw
242
4
0
04 Oct 2024
System 2 Reasoning Capabilities Are Nigh
Scott C. Lowe
VLM, LRM
204
2
0
04 Oct 2024
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning
Di Zhang
Jianbo Wu
Jingdi Lei
Tong Che
Jiatong Li
...
Shufei Zhang
Marco Pavone
Yuqiang Li
Wanli Ouyang
Dongzhan Zhou
LRM
272
93
0
03 Oct 2024
Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment
Yifan Zhang
Ge Zhang
Yue Wu
Kangping Xu
Quanquan Gu
503
3
0
03 Oct 2024
Learning to Better Search with Language Models via Guided Reinforced Self-Training
Seungyong Moon
Bumsoo Park
Hyun Oh Song
AIFin, RALM
313
4
0
03 Oct 2024
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
Huimu Yu
Xing Wu
Weidong Yin
Debing Zhang
Songlin Hu
LRM
347
9
0
03 Oct 2024
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
International Conference on Learning Representations (ICLR), 2024
Xiangyu Peng
Congying Xia
Xinyi Yang
Caiming Xiong
Chien-Sheng Wu
Chen Xing
LRM
375
16
0
03 Oct 2024
GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning
Jiale Fu
Yaqing Wang
Simeng Han
Jiaming Fan
Chen Si
530
1
0
03 Oct 2024
Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
International Conference on Learning Representations (ICLR), 2024
Hongkang Li
Songtao Lu
Pin-Yu Chen
Xiaodong Cui
Meng Wang
LRM
561
12
0
03 Oct 2024
Evaluating Robustness of Reward Models for Mathematical Reasoning
Sunghwan Kim
Dongjin Kang
Taeyoon Kwon
Hyungjoo Chae
Jungsoo Won
Dongha Lee
Jinyoung Yeo
220
15
0
02 Oct 2024
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xingxuan Li
Weiwen Xu
Ruochen Zhao
Fangkai Jiao
Shafiq Joty
Lidong Bing
LRM
270
26
0
02 Oct 2024
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
International Conference on Learning Representations (ICLR), 2024
Shengyu Feng
Xiang Kong
Shuang Ma
Aonan Zhang
Dong Yin
Chong-Jun Wang
Ruoming Pang
Yiming Yang
LRM
483
12
0
02 Oct 2024
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits
Duy Nguyen
Archiki Prasad
Elias Stengel-Eskin
Joey Tianyi Zhou
553
6
0
02 Oct 2024
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
International Conference on Learning Representations (ICLR), 2024
Danqing Wang
Jianxin Ma
Fei Fang
Lei Li
LLMAG, LRM
930
2
0
02 Oct 2024
Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling
Jinghan Li
Zhicheng Sun
Fei Li
856
2
0
02 Oct 2024
RATIONALYST: Mining Implicit Rationales for Process Supervision of Reasoning
Dongwei Jiang
Guoxuan Wang
Yining Lu
Andrew Wang
Jingyu Zhang
Chuyu Liu
Benjamin Van Durme
Daniel Khashabi
LRM, ReLM
224
3
0
01 Oct 2024
Inference-Time Language Model Alignment via Integrated Value Guidance
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zhixuan Liu
Zhanhui Zhou
Yuanfu Wang
Chao Yang
Yu Qiao
188
17
0
26 Sep 2024
Logic-of-Thought: Injecting Logic into Contexts for Full Reasoning in Large Language Models
Tongxuan Liu
Wenjiang Xu
Weizhe Huang
Yuting Zeng
Jiaxing Wang
Hailong Yang
Jing Li
LRM, ReLM
337
24
0
26 Sep 2024
Direct Judgement Preference Optimization
Peifeng Wang
Austin Xu
Yilun Zhou
Caiming Xiong
Shafiq Joty
ELM
396
24
0
23 Sep 2024
GroupDebate: Enhancing the Efficiency of Multi-Agent Debate Using Group Discussion
Tongxuan Liu
Xingyu Wang
Weizhe Huang
Wenjiang Xu
Yuting Zeng
Lei Jiang
Hailong Yang
Jing Li
LLMAG
394
46
0
21 Sep 2024
System 2 thinking in OpenAI's o1-preview model: Near-perfect performance on a mathematics exam
De Computis (DC), 2024
J. D. Winter
Dimitra Dodou
Y. B. Eisma
VLM, ELM, LRM, ReLM
355
22
0
19 Sep 2024
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
Xiaotian Han
Yiren Jian
Xuefeng Hu
Haogeng Liu
Yiqi Wang
...
Yuang Ai
Huaibo Huang
Ran He
Zhenheng Yang
Quanzeng You
LRM, AI4CE
224
38
0
19 Sep 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jin Jiang
Yuchen Yan
Yang Liu
Yonggang Jin
Shuai Peng
Hao Fei
Xunliang Cai
Yixin Cao
Liangcai Gao
LRM
499
10
0
19 Sep 2024
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
International Conference on Learning Representations (ICLR), 2024
Mingjie Liu
Yun-Da Tsai
Wenfei Zhou
Haoxing Ren
SyDa, 3DV
412
45
0
19 Sep 2024
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
Justin Chih-Yao Chen
Archiki Prasad
Swarnadeep Saha
Elias Stengel-Eskin
Joey Tianyi Zhou
LRM
603
36
0
18 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
International Conference on Learning Representations (ICLR), 2024
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM, LRM
738
265
0
18 Sep 2024
OmniGen: Unified Image Generation
Computer Vision and Pattern Recognition (CVPR), 2024
Shitao Xiao
Yueze Wang
Huaying Yuan
Xingrun Xing
Ruiran Yan
Shuting Wang
Tiejun Huang
Zheng Liu
DiffM, VLM, SyDa
496
299
0
17 Sep 2024
Quantile Regression for Distributional Reward Models in RLHF
Nicolai Dorka
359
49
0
16 Sep 2024
MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model
Zhen Yang
Jinhao Chen
Zhengxiao Du
Wenmeng Yu
Weihan Wang
Wenyi Hong
Zhihuan Jiang
Bin Xu
Yuxiao Dong
Jie Tang
VLM, LRM
213
15
0
10 Sep 2024
Programming Refusal with Conditional Activation Steering
International Conference on Learning Representations (ICLR), 2024
Bruce W. Lee
Inkit Padhi
Karthikeyan N. Ramamurthy
Erik Miehling
Pierre Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
535
96
0
06 Sep 2024
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao
Feifan Song
Yibo Miao
Zefan Cai
Zhiyong Yang
...
Houfeng Wang
Zhifang Sui
Peiyi Wang
Baobao Chang
498
19
0
04 Sep 2024
Compositional 3D-aware Video Generation with LLM Director
Neural Information Processing Systems (NeurIPS), 2024
Hanxin Zhu
Tianyu He
Anni Tang
Junliang Guo
Zhibo Chen
Jiang Bian
DiffM, VGen
219
13
0
31 Aug 2024
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
International Conference on Learning Representations (ICLR), 2024
Hritik Bansal
Arian Hosseini
Rishabh Agarwal
Vinh Q. Tran
Mehran Kazemi
SyDa, OffRL, LRM
304
67
0
29 Aug 2024
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xin Zheng
Jie Lou
Boxi Cao
Xueru Wen
Yuqiu Ji
Hongyu Lin
Yaojie Lu
Xianpei Han
Debing Zhang
Le Sun
OffRL, LRM, LLMAG, ReLM, KELM
578
27
1
29 Aug 2024
ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model
Lifan Jiang
Zhihui Wang
Siqi Yin
Guangxiao Ma
Peng Zhang
Boxi Wu
DiffM
360
25
0
28 Aug 2024
Large Language Models Are Self-Taught Reasoners: Enhancing LLM Applications via Tailored Problem-Solving Demonstrations
Kai Tzu-iunn Ong
Taeyoon Kwon
Jinyoung Yeo
LRM
140
1
0
22 Aug 2024
Visual Agents as Fast and Slow Thinkers
International Conference on Learning Representations (ICLR), 2024
Guangyan Sun
Haoyang Ling
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG, LRM
579
53
0
16 Aug 2024
Problem Solving Through Human-AI Preference-Based Cooperation
Computational Linguistics (CL), 2024
Subhabrata Dutta
Timo Kaufmann
Goran Glavaš
Ivan Habernal
Kristian Kersting
Frauke Kreuter
Mira Mezini
Iryna Gurevych
Eyke Hüllermeier
Hinrich Schuetze
995
8
0
14 Aug 2024
Can Large Language Models Reason? A Characterization via 3-SAT
Rishi Hazra
Gabriele Venturato
Pedro Zuidberg Dos Martires
Luc de Raedt
ELM, ReLM, LRM
260
17
0
13 Aug 2024
Speculations on Uncertainty and Humane Algorithms
Nicholas Gray
229
1
0
13 Aug 2024
Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Zhihu Wang
Shiwan Zhao
Yu Wang
Heyuan Huang
Sitao Xie
Xicheng Zhang
Jiaxin Shi
Zhixing Wang
Xue Yang
Junchi Yan
LRM
428
14
0
13 Aug 2024
Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Sangwoo Shin
Takehiro Matsuoka
Youngsoo Jang
Moontae Lee
Kazuya Yoshida
446
0
0
02 Aug 2024
ThinK: Thinner Key Cache by Query-Driven Pruning
International Conference on Learning Representations (ICLR), 2024
Yuhui Xu
Zhanming Jie
Hanze Dong
Lei Wang
Xudong Lu
Aojun Zhou
Amrita Saha
Caiming Xiong
Doyen Sahoo
626
46
0
30 Jul 2024
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Tianhao Wu
Weizhe Yuan
O. Yu. Golovneva
Jing Xu
Yuandong Tian
Jiantao Jiao
Jason Weston
Sainbayar Sukhbaatar
ALM, KELM, LRM
435
171
0
28 Jul 2024
Prover-Verifier Games improve legibility of LLM outputs
Jan Hendrik Kirchner
Yining Chen
Harri Edwards
Jan Leike
Nat McAleese
Yuri Burda
LRM, AAML
332
56
0
18 Jul 2024
Weak-to-Strong Reasoning
Yuqing Yang
Yan Ma
Pengfei Liu
LRM
402
31
0
18 Jul 2024