AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners

22 May 2025

Papers citing "AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners"

16 / 16 papers shown

Aligning Reasoning LLMs for Materials Discovery with Physics-aware Rejection Sampling

...

164

31 Aug 2025

Inference-Time Scaling for Generalist Reward Modeling

503

152

03 Apr 2025

Z1: Efficient Test-time Scaling with Code

339

01 Apr 2025

Understanding R1-Zero-Like Training: A Critical Perspective

523

612

26 Mar 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

...

758

273

20 Mar 2025

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

...

629

1,006

18 Mar 2025

Lightweight Dataset Pruning without Full Training via Example Difficulty and Prediction Uncertainty

247

10 Feb 2025

Kimi k1.5: Scaling Reinforcement Learning with LLMs

...

OffRL ALM AI4TS VLM LRM

1.0K

702

22 Jan 2025

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

...

OffRL AI4TS LRM ReLM VLM

1.2K

5,342

22 Jan 2025

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

952

576

03 Jan 2025

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

International Conference on Learning Representations (ICLR), 2024

499

23 Dec 2024

Automatic Curriculum Expert Iteration for Reliable LLM Reasoning

International Conference on Learning Representations (ICLR), 2024

Hanze Dong

Caiming Xiong

356

10 Oct 2024

ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement

International Conference on Learning Representations (ICLR), 2024

Xiangyu Peng

Congying Xia

Xinyi Yang

Caiming Xiong

Chien-Sheng Wu

Chen Xing

LRM

320

03 Oct 2024

Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates

509

23 Aug 2024

Lean-STaR: Learning to Interleave Thinking and Proving

703

14 Jul 2024

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

International Conference on Machine Learning (ICML), 2023

996

122

31 Dec 2023