2505.21668
R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

27 May 2025

Papers citing "R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning"

14 / 14 papers shown

A Survey on Agentic Multimodal Large Language Models

...

LM&Ro AIFin AI4TS LRM AI4CE

246

4

0

13 Oct 2025

How Many Code and Test Cases Are Enough? Evaluating Test Cases Generation from a Binary-Matrix Perspective

96

2

0

09 Oct 2025

Learning to Reason for Hallucination Span Detection

Hadi Pouransari

Raviteja Vemulapalli

ReLM OffRL HILM LRM

249

2

0

02 Oct 2025

Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning

Shubhashis Roy Dipta

Hengyuan Zhang

133

1

0

27 Sep 2025

Learning to Reason in Structured In-context Environments with Reinforcement Learning

177

0

0

27 Sep 2025

WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning

129

0

0

26 Sep 2025

NIRVANA: Structured pruning reimagined for large language models compression

1.6K

1

0

17 Sep 2025

ToolRL: Reward is All Tool Learning Needs

Emre Can Acikgoz

Dilek Hakkani-Tur

536

146

0

16 Apr 2025

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

OffRL ReLM SyDa LRM VLM

486

171

0

10 Apr 2025

ToRL: Scaling Tool-Integrated RL

412

76

0

30 Mar 2025

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

385

200

0

17 Mar 2025

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

OffRL AI4TS LRM RALM ReLM KELM

807

560

0

12 Mar 2025

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Dale Schuurmans

675

404

0

28 Jan 2025

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

...

OffRL AI4TS LRM ReLM VLM

1.2K

5,342

0

22 Jan 2025