ResearchTrend.AI
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling


24 February 2025
Yiwen Ding
Zhiheng Xi
Wei He
Zhuoyuan Li
Yitao Zhai
Xiaowei Shi
Xunliang Cai
Tao Gui
Qi Zhang
Xuanjing Huang
    LRM

Papers citing "Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling"

2 / 2 papers shown
Improving RL Exploration for LLM Reasoning through Retrospective Replay
Shihan Dou
Muling Wu
Jingwen Xu
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
OffRL
LRM
32
0
0
19 Apr 2025
GiFT: Gibbs Fine-Tuning for Code Generation
Haochen Li
Wanjin Feng
Xin Zhou
Zhiqi Shen
SyDa
75
1
0
17 Feb 2025