Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018

Pieter Abbeel

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,552 papers shown

Reinforcement Learning for Charging Optimization of Inhomogeneous Dicke Quantum Batteries

15 Nov 2025

From Exploration to Exploitation: A Two-Stage Entropy RLVR Approach for Noise-Tolerant MLLM Training

...

142

11 Nov 2025

PrefPoE: Advantage-Guided Preference Fusion for Learning Where to Explore

125

11 Nov 2025

Dynamic Sparsity: Challenging Common Sparsity Assumptions for Learning World Models in Robotic Reinforcement Learning Benchmarks

200

11 Nov 2025

SafeMIL: Learning Offline Safe Imitation Policy from Non-Preferred Trajectories

370

11 Nov 2025

LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration

Applied Soft Computing (ASC), 2017

132

11 Nov 2025

Multistep Quasimetric Learning for Scalable Goal-conditioned Reinforcement Learning

199

11 Nov 2025

PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork

104

10 Nov 2025

Secure Low-altitude Maritime Communications via Intelligent Jamming

Science China Information Sciences (Sci. China Inf. Sci.), 2025

110

10 Nov 2025

Controllable Flow Matching for Online Reinforcement Learning

144

10 Nov 2025

Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training

180

10 Nov 2025

Rapidly Learning Soft Robot Control via Implicit Time-Stepping

Andrew Choi

Dezhong Tong

10 Nov 2025

What Makes Reasoning Invalid: Echo Reflection Mitigation for Large Language Models

237

09 Nov 2025

Guardian-regularized Safe Offline Reinforcement Learning for Smart Weaning of Mechanical Circulatory Devices

156

08 Nov 2025

Gentle Manipulation Policy Learning via Demonstrations from VLM Planned Atomic Skills

487

08 Nov 2025

Towards Personalized Quantum Federated Learning for Anomaly Detection

IEEE Transactions on Network Science and Engineering (IEEE TNS&E), 2025

Ratun Rahman

Sina Shaham

Dinh C. Nguyen

164

08 Nov 2025

SAD-Flower: Flow Matching for Safe, Admissible, and Dynamically Consistent Planning

189

07 Nov 2025

Multi-agent Coordination via Flow Matching

Dongsu Lee

Daehee Lee

Amy Zhang

130

07 Nov 2025

On Flow Matching KL Divergence

330

07 Nov 2025

Blind Inverse Game Theory: Jointly Decoding Rewards and Rationality in Entropy-Regularized Competitive Games

Hamza Virk

Sandro Amaglobeli

Zuhayr Syed

100

07 Nov 2025

Distributionally Robust Self Paced Curriculum Reinforcement Learning

497

07 Nov 2025

ReGen: Generative Robot Simulation via Inverse Design

International Conference on Learning Representations (ICLR), 2025

172

06 Nov 2025

Can Context Bridge the Reality Gap? Sim-to-Real Transfer of Context-Aware Policies

142

06 Nov 2025

Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning

134

06 Nov 2025

Periodic Skill Discovery

334

05 Nov 2025

Optimizing Multi-Lane Intersection Performance in Mixed Autonomy Environments

Manonmani Sekar

Nasim Nezamoddini

189

04 Nov 2025

Natural-gas storage modelling by deep reinforcement learning

Tiziano Balaconi

Aldo Glielmo

Marco Taboga

04 Nov 2025

Automated Reward Design for Gran Turismo

209

03 Nov 2025

Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization

Ziqi Wang

Jiashun Liu

L. Pan

234

03 Nov 2025

Clustering-Based Weight Orthogonalization for Stabilizing Deep Reinforcement Learning

IEEE International Joint Conference on Neural Network (IJCNN), 2025

130

02 Nov 2025

SLAP: Shortcut Learning for Abstract Planning

129

02 Nov 2025

Bootstrap Off-policy with World Model

418

01 Nov 2025

Learning Soft Robotic Dynamics with Active Exploration

Robert K. Katzschmann

141

31 Oct 2025

Asynchronous Risk-Aware Multi-Agent Packet Routing for Ultra-Dense LEO Satellite Networks

100

31 Oct 2025

Morphology-Aware Graph Reinforcement Learning for Tensegrity Robot Locomotion

104

30 Oct 2025

Towards Reinforcement Learning Based Log Loading Automation

30 Oct 2025

SpikeATac: A Multimodal Tactile Finger with Taxelized Dynamic Sensing for Dexterous Manipulation

...

212

30 Oct 2025

Real-DRL: Teach and Learn in Reality

135

30 Oct 2025

Navigation in a Three-Dimensional Urban Flow using Deep Reinforcement Learning

Federica Tonti

Ricardo Vinuesa

29 Oct 2025

Sim-to-Real Gentle Manipulation of Deformable and Fragile Objects with Stress-Guided Reinforcement Learning

134

29 Oct 2025

Dense and Diverse Goal Coverage in Multi Goal Reinforcement Learning

Sagalpreet Singh

Rishi Saket

A. Raghuveer

115

29 Oct 2025

Off-policy Reinforcement Learning with Model-based Exploration Augmentation

173

29 Oct 2025

Sample-efficient and Scalable Exploration in Continuous-Time RL

140

28 Oct 2025

Survey and Tutorial of Reinforcement Learning Methods in Process Systems Engineering

Maximilian Bloor

M. Mowbray

Ehecatl Antonio del Rio Chanona

Calvin Tsay

OffRL

132

28 Oct 2025

Multi-Agent Conditional Diffusion Model with Mean Field Communication as Wireless Resource Allocation Planner

160

27 Oct 2025

Human-Like Goalkeeping in a Realistic Football Simulation: a Sample-Efficient Reinforcement Learning Approach

Alessandro Sestini

Joakim Bergdahl

Jean-Philippe Barrette-LaPierre

170

27 Oct 2025

TARC: Time-Adaptive Robotic Control

113

27 Oct 2025

FlowCritic: Bridging Value Estimation with Flow Matching in Reinforcement Learning

127

26 Oct 2025

Mind Your Entropy: From Maximum Entropy to Trajectory Entropy-Constrained RL

25 Oct 2025

STAR-RIS-assisted Collaborative Beamforming for Low-altitude Wireless Networks

25 Oct 2025