ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.05074
  4. Cited By
DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller
  Language Models

DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models

8 October 2023
Chengcheng Han
Xiaowei Du
Che Zhang
Yixin Lian
Xiang Li
Ming Gao
Baoyuan Wang
    LRM
ArXivPDFHTML

Papers citing "DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models"

6 / 6 papers shown
Title
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer
  Selection in Large Language Models
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models
Zhangyue Yin
Qiushi Sun
Qipeng Guo
Zhiyuan Zeng
Xiaonan Li
...
Qinyuan Cheng
Ding Wang
Xiaofeng Mou
Xipeng Qiu
XuanJing Huang
LRM
41
3
0
21 May 2024
Compositional Semantic Parsing with Large Language Models
Compositional Semantic Parsing with Large Language Models
Andrew Drozdov
Nathanael Scharli
Ekin Akyuurek
Nathan Scales
Xinying Song
Xinyun Chen
Olivier Bousquet
Denny Zhou
ReLM
LRM
187
91
0
29 Sep 2022
Is a Question Decomposition Unit All We Need?
Is a Question Decomposition Unit All We Need?
Pruthvi H. Patel
Swaroop Mishra
Mihir Parmar
Chitta Baral
ReLM
130
50
0
25 May 2022
Large Language Models are Zero-Shot Reasoners
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
1