ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.18536
  4. Cited By
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models

Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models

24 May 2025
Haoyuan Sun
Jiaqi Wu
Bo Xia
Yifu Luo
Yifei Zhao
Kai Qin
Xufei Lv
Tiantian Zhang
Yongzhe Chang
Xueqian Wang
    OffRL
    LRM
ArXivPDFHTML

Papers citing "Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models"

6 / 106 papers shown
Title
Asynchronous Methods for Deep Reinforcement Learning
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
168
8,805
0
04 Feb 2016
Dueling Network Architectures for Deep Reinforcement Learning
Dueling Network Architectures for Deep Reinforcement Learning
Ziyun Wang
Tom Schaul
Matteo Hessel
H. V. Hasselt
Marc Lanctot
Nando de Freitas
OffRL
67
3,742
0
20 Nov 2015
Deep Reinforcement Learning with Double Q-learning
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
134
7,590
0
22 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage
  Estimation
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
45
3,368
0
08 Jun 2015
Trust Region Policy Optimization
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
245
6,722
0
19 Feb 2015
Playing Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
103
12,163
0
19 Dec 2013
Previous
123