Continuous control with deep reinforcement learning

9 September 2015

Timothy Lillicrap

Jonathan J. Hunt

Alexander Pritzel

David Silver

arXiv:1509.02971

Papers citing "Continuous control with deep reinforcement learning"

50 / 4,795 papers shown

Application of Deep Reinforcement Learning to At-the-Money S&P 500 Options Hedging

Working papers (WP), 2025

Paweł Sakowski

Jakub Michańków

190

0

0

10 Oct 2025

Hierarchical Semantic RL: Tackling the Problem of Dynamic Action Space for RL-based Recommendations

133

0

0

10 Oct 2025

Control Synthesis of Cyber-Physical Systems for Real-Time Specifications through Causation-Guided Reinforcement Learning

85

0

0

09 Oct 2025

Energy-Guided Diffusion Sampling for Long-Term User Behavior Prediction in Reinforcement Learning-based Recommendation

104

0

0

09 Oct 2025

GRADE: Personalized Multi-Task Fusion via Group-relative Reinforcement Learning with Adaptive Dirichlet Exploration

167

0

0

09 Oct 2025

Maximum In-Support Return Modeling for Dynamic Recommendation with Language Model Prior

88

0

0

09 Oct 2025

Local Reinforcement Learning with Action-Conditioned Root Mean Squared Q-Functions

155

0

0

08 Oct 2025

What You Don't Know Can Hurt You: How Well do Latent Safety Filters Understand Partially Observable Safety Constraints?

Kensuke Nakamura

Andrea V. Bajcsy

103

0

0

07 Oct 2025

Controllable Audio-Visual Viewpoint Generation from 360° Spatial Information

Christian Marinoni

R. F. Gramaccioni

Eleonora Grassucci

Danilo Comminiello

148

0

0

07 Oct 2025

DREAMer-VXS: A Latent World Model for Sample-Efficient AGV Exploration in Stochastic, Unobserved Environments

Agniprabha Chakraborty

55

0

0

06 Oct 2025

HOFLON: Hybrid Offline Learning and Online Optimization for Process Start-Up and Grade-Transition Control

Mehmet Mercangöz

262

0

0

04 Oct 2025

Unsupervised Transformer Pre-Training for Images: Self-Distillation, Mean Teachers, and Random Crops

Mattia Scardecchia

166

0

0

04 Oct 2025

Physics-informed Neural-operator Predictive Control for Drag Reduction in Turbulent Flows

Kamyar Azizzadenesheli

Anima Anandkumar

114

0

0

03 Oct 2025

D2 Actor Critic: Diffusion Actor Meets Distributional Critic

Bradly C. Stadie

263

1

0

03 Oct 2025

ExGRPO: Learning to Reason from Experience

145

1

1

02 Oct 2025

From Pixels to Factors: Learning Independently Controllable State Variables for Reinforcement Learning

Rafael Rodríguez-Sánchez

George Konidaris

184

2

0

02 Oct 2025

Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking

Shaifalee Saxena

85

0

0

02 Oct 2025

Conflict-Based Search as a Protocol: A Multi-Agent Motion Planning Protocol for Heterogeneous Agents, Solvers, and Independent Tasks

Rishi Veerapaneni

...

Jon Arrizabalaga

Maxim Likhachev

92

1

0

01 Oct 2025

Multi-Actor Multi-Critic Deep Deterministic Reinforcement Learning with a Novel Q-Ensemble Method

92

0

0

01 Oct 2025

Constant in an Ever-Changing World

60

0

0

01 Oct 2025

Deep Reinforcement Learning-Based Precoding for Multi-RIS-Aided Multiuser Downlink Systems with Practical Phase Shift

IEEE Wireless Communications Letters (WCL), 2025

Ronald Y. Chang

76

8

0

30 Sep 2025

Accelerating Transformers in Online RL

Daniil Zelezetsky

Aleksandr I. Panov

143

0

0

30 Sep 2025

Diversity-Incentivized Exploration for Versatile Reasoning

C. L. Philip Chen

146

2

0

30 Sep 2025

DyMoDreamer: World Modeling with Dynamic Modulation

144

0

0

29 Sep 2025

Polychromic Objectives for Reinforcement Learning

Jubayer Ibn Hamid

Ifdita Hasan Orney

104

1

0

29 Sep 2025

Robust Policy Expansion for Offline-to-Online RL under Diverse Data Corruption

310

0

0

29 Sep 2025

Safe In-Context Reinforcement Learning

Alper Kamil Bozkurt

Shangtong Zhang

135

1

0

29 Sep 2025

An Investigation of Batch Normalization in Off-Policy Actor-Critic Algorithms

153

0

0

28 Sep 2025

Bridging Discrete and Continuous RL: Stable Deterministic Policy Gradient with Martingale Characterization

97

0

0

28 Sep 2025

Continuous-Time Reinforcement Learning for Asset-Liability Management

76

0

0

27 Sep 2025

From Parameters to Behavior: Unsupervised Compression of the Policy Space

Davide Tenedini

Riccardo Zamboni

Marcello Restelli

136

1

0

26 Sep 2025

Functional Critics Are Essential in Off-Policy Actor-Critic: Provable Convergence and Efficient Exploration

158

0

0

26 Sep 2025

The Use of the Simplex Architecture to Enhance Safety in Deep-Learning-Powered Autonomous Systems

Giorgio Maria Cicero

Alessandro Biondi

Giorgio Buttazzo

173

0

0

25 Sep 2025

Leveraging Temporally Extended Behavior Sharing for Multi-task Reinforcement Learning

200

0

0

25 Sep 2025

Analysis of approximate linear programming solution to Markov decision problem with log barrier function

142

0

0

24 Sep 2025

Frictional Q-Learning

153

0

0

24 Sep 2025

Memory-Augmented Potential Field Theory: A Framework for Adaptive Control in Non-Convex Domains

109

0

0

24 Sep 2025

AnySafe: Adapting Latent Safety Filters at Runtime via Safety Constraint Parameterization in the Latent Space

Sankalp Agrawal

Kensuke Nakamura

Andrea V. Bajcsy

116

1

0

23 Sep 2025

Residual Off-Policy RL for Finetuning Behavior Cloning Policies

Anusha Nagabandi

221

4

0

23 Sep 2025

SOE: Sample-Efficient Robot Policy Self-Improvement via On-Manifold Exploration

177

0

0

23 Sep 2025

EigenSafe: A Spectral Framework for Learning-Based Stochastic Safety Filtering

Chams E. Mballo

Claire J. Tomlin

111

0

0

22 Sep 2025

Fast Trajectory Planner with a Reinforcement Learning-based Controller for Robotic Manipulators

Engineering Applications of Artificial Intelligence (EAAI), 2025

Hamidreza Kasaei

112

0

0

22 Sep 2025

MCP: A Control-Theoretic Orchestration Framework for Synergistic Efficiency and Interpretability in Multimodal Large Language Models

84

0

0

20 Sep 2025

HypeMARL: Multi-Agent Reinforcement Learning For High-Dimensional, Parametric, and Distributed Systems

Matteo Tomasetto

Francesco Braghin

136

0

0

20 Sep 2025

GP3: A 3D Geometry-Aware Policy with Multi-View Images for Robotic Manipulation

132

3

0

19 Sep 2025

Accelerating Atomic Fine Structure Determination with Graph Reinforcement Learning

J. C. Pickering

98

0

0

19 Sep 2025

Deep Learning Empowered Super-Resolution: A Comprehensive Survey and Future Prospects

Proceedings of the IEEE (Proc. IEEE), 2025

Yonina C. Eldar

285

1

0

19 Sep 2025

Designing Latent Safety Filters using Pre-Trained Vision Models

Maxwell Astafyev

81

1

0

18 Sep 2025

Safe Reinforcement Learning using Action Projection: Safeguard the Policy or the Environment?

Hannah Markgraf

Shamburaj Sawant

Hanna Krasowski

Matthias Althoff

156

0

0

16 Sep 2025

CORB-Planner: Corridor as Observations for RL Planning in High-Speed Flight

90

0

0

14 Sep 2025
