Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2312.02179
Cited By
Training Chain-of-Thought via Latent-Variable Inference
28 November 2023
Du Phan
Matthew D. Hoffman
David Dohan
Sholto Douglas
Tuan Anh Le
Aaron T Parisi
Pavel Sountsov
Charles Sutton
Sharad Vikram
Rif A. Saurous
BDL
ReLM
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Training Chain-of-Thought via Latent-Variable Inference"
23 / 23 papers shown
Title
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao
Yifan Hao
Hanning Zhang
Hanze Dong
Wei Xiong
Nan Jiang
Tong Zhang
LRM
50
0
0
05 May 2025
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
J. Zhang
Flood Sung
Z. Yang
Yang Gao
Chongjie Zhang
LLMAG
38
0
0
28 Apr 2025
ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning
Zijian Wang
Chang Xu
LRM
21
1
0
09 Apr 2025
Learning to chain-of-thought with Jensen's evidence lower bound
Yunhao Tang
Sid Wang
Rémi Munos
BDL
OffRL
LRM
50
0
0
25 Mar 2025
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
Kanishk Gandhi
Ayush Chakravarthy
Anikait Singh
Nathan Lile
Noah D. Goodman
ReLM
LRM
88
28
0
03 Mar 2025
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs
Dayu Yang
Tianyang Liu
Daoan Zhang
Antoine Simoulin
Xiaoyi Liu
...
Zhaopu Teng
Xin Qian
Grey Yang
Jiebo Luo
Julian McAuley
ReLM
OffRL
LRM
81
3
0
26 Feb 2025
Scalable Language Models with Posterior Inference of Latent Thought Vectors
Deqian Kong
Minglu Zhao
Dehong Xu
Bo Pang
Shu Wang
...
Zhangzhang Si
Chuan Li
Jianwen Xie
Sirui Xie
Ying Nian Wu
VLM
LRM
BDL
76
5
0
03 Feb 2025
Supervision-free Vision-Language Alignment
Giorgio Giannone
Ruoteng Li
Qianli Feng
Evgeny Perevodchikov
Rui Chen
Aleix M. Martinez
VLM
58
0
0
08 Jan 2025
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models
Linhao Luo
Zicheng Zhao
Chen Gong
Gholamreza Haffari
Shirui Pan
RALM
LRM
32
4
0
16 Oct 2024
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Zirui Zhao
Hanze Dong
Amrita Saha
Caiming Xiong
Doyen Sahoo
LRM
27
3
0
10 Oct 2024
Can a Bayesian Oracle Prevent Harm from an Agent?
Yoshua Bengio
Michael K. Cohen
Nikolay Malkin
Matt MacDermott
Damiano Fornasiere
Pietro Greiner
Younesse Kaddar
34
4
0
09 Aug 2024
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Stephen Zhao
Rob Brekelmans
Alireza Makhzani
Roger C. Grosse
27
9
0
26 Apr 2024
NExT: Teaching Large Language Models to Reason about Code Execution
Ansong Ni
Miltiadis Allamanis
Arman Cohan
Yinlin Deng
Kensen Shi
Charles Sutton
Pengcheng Yin
ReLM
LRM
23
34
0
23 Apr 2024
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
Yu Feng
Ben Zhou
Weidong Lin
Dan Roth
59
4
0
18 Apr 2024
STaR-GATE: Teaching Language Models to Ask Clarifying Questions
Chinmaya Andukuri
Jan-Philipp Fränken
Tobias Gerstenberg
Noah D. Goodman
SyDa
LRM
35
27
0
28 Mar 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG
ReLM
LRM
42
108
0
14 Mar 2024
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Avi Singh
John D. Co-Reyes
Rishabh Agarwal
Ankesh Anand
Piyush Patil
...
Yamini Bansal
Ethan Dyer
Behnam Neyshabur
Jascha Narain Sohl-Dickstein
Noah Fiedel
ALM
LRM
ReLM
SyDa
147
143
0
11 Dec 2023
Amortizing intractable inference in large language models
Marvin Schmitt
Moksh Jain
Daniel Habermann
Younesse Kaddar
Ullrich Kothe
Stefan T. Radev
Nikolay Malkin
AIFin
BDL
19
45
0
06 Oct 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Markovian Score Climbing: Variational Inference with KL(p||q)
C. A. Naesseth
Fredrik Lindsten
David M. Blei
108
54
0
23 Mar 2020
1