Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
arXiv:1801.01290

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,552 papers shown
FlowRL: Matching Reward Distributions for LLM Reasoning
Xuekai Zhu
Daixuan Cheng
D. Zhang
Hengli Li
Kaiyan Zhang
...
J. Gao
Xiaodong Liu
Bowen Zhou
Hongyuan Mei
Zhouhan Lin
LRM
246
6
0
18 Sep 2025
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
Yujun Zhou
Zhenwen Liang
Haolin Liu
Wenhao Yu
Kishan Panaganti
Linfeng Song
Dian Yu
Xiangliang Zhang
Haitao Mi
Dong Yu
177
14
0
18 Sep 2025
Sample Efficient Experience Replay in Non-stationary Environments
Tianyang Duan
Zongyuan Zhang
Songxiao Guo
Yuanye Zhao
Zheng Lin
...
Yi Liu
Dianxin Luan
Dong Huang
Heming Cui
Yong Cui
132
2
0
18 Sep 2025
SHaRe-RL: Structured, Interactive Reinforcement Learning for Contact-Rich Industrial Assembly Tasks
Jannick Stranghöner
Philipp Hartmann
Marco Braun
Sebastian Wrede
Klaus Neumann
OffRL
104
3
0
17 Sep 2025
StableTracker: Learning to Stably Track Target via Differentiable Simulation
Fanxing Li
Shengyang Wang
Fangyu Sun
Shuyu Wu
Dexin Zuo
Wenxian Yu
Danping Zou
160
0
0
17 Sep 2025
Reinforcement Learning for Robotic Insertion of Flexible Cables in Industrial Settings
Jeongwoo Park
Seabin Lee
Changmin Park
Wonjong Lee
Changjoo Nam
116
0
0
17 Sep 2025
SEG-Parking: Towards Safe, Efficient, and Generalizable Autonomous Parking via End-to-End Offline Reinforcement Learning
Zewei Yang
Zengqi Peng
Jun Ma
OffRL
105
0
0
17 Sep 2025
Online Learning of Deceptive Policies under Intermittent Observation
Gokul Puthumanaillam
Ram Padmanabhan
Jose Fuentes
Nicole Cruz
Paulo Padrao
Ruben Hernandez
Hao Jiang
William E. Schafer
Leonardo Bobadilla
Melkior Ornik
OffRL
122
0
0
17 Sep 2025
Large Language Model-Empowered Decision Transformer for UAV-Enabled Data Collection
Zhixion Chen
Jiangzhou Wang
Hyundong Shin
Arumugam Nallanathan
OffRL
106
0
0
17 Sep 2025
EvoEmpirBench: Dynamic Spatial Reasoning with Agent-ExpVer
Pukun Zhao
Longxiang Wang
Miaowei Wang
Chen Chen
Fanqing Zhou
Haojian Huang
205
0
0
16 Sep 2025
Force-Modulated Visual Policy for Robot-Assisted Dressing with Arm Motions
Alexis Yihong Hao
Yufei Wang
Navin Sriram Ravie
Bharath Hegde
David Held
Zackory M. Erickson
113
0
0
16 Sep 2025
GRATE: a Graph transformer-based deep Reinforcement learning Approach for Time-efficient autonomous robot Exploration
Haozhan Ni
Jingsong Liang
Chenyu He
Yuhong Cao
Guillaume Sartoretti
OffRL
144
0
0
16 Sep 2025
Empowering Multi-Robot Cooperation via Sequential World Models
Zijie Zhao
Honglei Guo
Shengqian Chen
Kaixuan Xu
Bo Jiang
Yuanheng Zhu
Dongbin Zhao
212
3
0
16 Sep 2025
MEMBOT: Memory-Based Robot in Intermittent POMDP
Youzhi Liang
Eyan Noronha
OffRL
84
0
0
14 Sep 2025
Mutual Information Tracks Policy Coherence in Reinforcement Learning
Cameron Reid
Wael Hafez
Amirhossein Nazeri
120
0
0
12 Sep 2025
Reinforcement learning for spin torque oscillator tasks
J. Mojsiejuk
Sławomir Ziętek
W. Skowroński
28
0
0
12 Sep 2025
CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models
Runpeng Dai
Linfeng Song
Haolin Liu
Zhenwen Liang
Dian Yu
...
Zhaopeng Tu
R. Liu
Tong Zheng
Hongtu Zhu
Dong Yu
LRM
176
10
0
11 Sep 2025
Off Policy Lyapunov Stability in Reinforcement Learning
Sarvan Gill
Daniela Constantinescu
95
0
0
11 Sep 2025
Sample-Efficient Online Control Policy Learning with Real-Time Recursive Model Updates
Zixin Zhang
James Avtges
Todd Murphey
121
0
0
10 Sep 2025
RAPID Quantum Detection and Demodulation of Covert Communications: Breaking the Noise Limit with Solid-State Spin Sensors
Amirhossein Taherpour
Abbas Taherpour
Tamer Khattab
110
0
0
09 Sep 2025
The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward
Long Li
Jiaran Hao
Jason Klein Liu
Zhijian Zhou
Yanting Miao
...
Wei Chu
Zhe Wang
Shirui Pan
Chao Qu
Yuan Qi
187
6
0
09 Sep 2025
Interactive Shaping of Granular Media Using Reinforcement Learning
Benedikt Kreis
Malte Mosbach
Anny Ripke
Muhammad Ehsan Ullah
Sven Behnke
Maren Bennewitz
117
0
0
08 Sep 2025
Simulation Priors for Data-Efficient Deep Learning
Lenart Treven
Bhavya Sukhija
Jonas Rothfuss
Stelian Coros
Florian Dorfler
Andreas Krause
135
0
0
06 Sep 2025
TalkToAgent: A Human-centric Explanation of Reinforcement Learning Agents with Large Language Models
Haechang Kim
Hao Chen
Can Li
Jong Min Lee
LLMAG
155
0
0
05 Sep 2025
DeGuV: Depth-Guided Visual Reinforcement Learning for Generalization and Interpretability in Manipulation
Tien Pham
Xinyun Chi
Khang Nguyen
Manfred Huber
Angelo Cangelosi
OffRL
121
0
0
05 Sep 2025
Solving Robotics Tasks with Prior Demonstration via Exploration-Efficient Deep Reinforcement Learning
Chengyandan Shen
Christoffer Sloth
OffRL
113
0
0
04 Sep 2025
Learning Multi-Stage Pick-and-Place with a Legged Mobile Manipulator
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Haichao Zhang
Haonan Yu
Le Zhao
Andrew Choi
Qinxun Bai
Yiqing Yang
Wei Xu
204
0
0
04 Sep 2025
Bootstrapping Reinforcement Learning with Sub-optimal Policies for Autonomous Driving
Zhihao Zhang
Chengyang Peng
Ekim Yurtsever
Keith A. Redmill
88
1
0
04 Sep 2025
Reinforcement Learning for Robust Ageing-Aware Control of Li-ion Battery Systems with Data-Driven Formal Verification
Rudi Coppola
Hovsep Touloujian
Pierfrancesco Ombrini
Manuel Mazo Jr
OffRL
76
0
0
04 Sep 2025
On Entropy Control in LLM-RL Algorithms
Han Shen
155
12
0
03 Sep 2025
Uncertainty-driven Adaptive Exploration
Leonidas Bakopoulos
Georgios Chalkiadakis
184
0
0
03 Sep 2025
DUViN: Diffusion-Based Underwater Visual Navigation via Knowledge-Transferred Depth Features
Jinghe Yang
Minh-Quan Le
Mingming Gong
Ye Pu
124
1
0
03 Sep 2025
Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control
Skand Peri
Akhil Perincherry
Bikram Pandit
Stefan Lee
135
0
0
01 Sep 2025
Adaptive Vehicle Speed Classification via BMCNN with Reinforcement Learning-Enhanced Acoustic Processing
Yuli Zhang
Pengfei Fan
Ruiyuan Jiang
Hankang Gu
Dongyao Jia
Xinheng Wang
86
0
0
31 Aug 2025
Jacobian Exploratory Dual-Phase Reinforcement Learning for Dynamic Endoluminal Navigation of Deformable Continuum Robots
Yu Tian
Chi Kit Ng
Hongliang Ren
99
0
0
30 Aug 2025
LLM-Driven Policy Diffusion: Enhancing Generalization in Offline Reinforcement Learning
Hanping Zhang
Yuhong Guo
OffRL
178
0
0
30 Aug 2025
Machine Intelligence on the Edge: Interpretable Cardiac Pattern Localisation Using Reinforcement Learning
Haozhe Tian
Qiyu Rao
Nina Moutonnet
P. Ferraro
Danilo Mandic
91
0
0
29 Aug 2025
First Order Model-Based RL through Decoupled Backpropagation
Joseph Amigo
Rooholla Khorrambakht
Elliot Chane-Sane
Nicolas Mansard
Ludovic Righetti
161
0
0
29 Aug 2025
Convergence of regularized agent-state-based Q-learning in POMDPs
Amit Sinha
Matthieu Geist
Aditya Mahajan
99
0
0
29 Aug 2025
Single Agent Robust Deep Reinforcement Learning for Bus Fleet Control
Yifan Zhang
52
0
0
28 Aug 2025
Divide, Discover, Deploy: Factorized Skill Learning with Symmetry and Style Priors
Rafael Cathomen
Mayank Mittal
Marin Vlastelica
Marco Hutter
151
2
0
27 Aug 2025
MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for agentic tool use
Weikang Zhao
Xili Wang
Chengdi Ma
Lingbin Kong
Zhaohua Yang
Mingxiang Tuo
Xiaowei Shi
Yitao Zhai
Xunliang Cai
119
6
0
26 Aug 2025
Stability and Generalization for Bellman Residuals
Enoch H. Kang
Kyoungseok Jang
OffRL
113
0
0
26 Aug 2025
ANO : Faster is Better in Noisy Landscape
Adrien Kegreisz
ODL
381
0
0
25 Aug 2025
Convergence and Generalization of Anti-Regularization for Parametric Models
Dongseok Kim
Wonjun Jeong
Gisung Oh
231
0
0
24 Aug 2025
Autonomous UAV Flight Navigation in Confined Spaces: A Reinforcement Learning Approach
Marco S. Tayar
Lucas K. de Oliveira
Juliano Negri
Thiago H. Segreto
Ricardo V. Godoy
Marcelo Becker
164
1
0
22 Aug 2025
A Dynamical Systems Framework for Reinforcement Learning Safety and Robustness Verification
Ahmed Nasir
Abdelhafid Zenati
83
0
0
21 Aug 2025
Understanding Action Effects through Instrumental Empowerment in Multi-Agent Reinforcement Learning
Ardian Selmonaj
M. Strupl
Oleg Szehr
Alessandro Antonucci
156
0
0
21 Aug 2025
Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning
Xiancheng Gao
Yufeng Shi
Wengang Zhou
Houqiang Li
OffRL
245
0
0
21 Aug 2025
Compute-Optimal Scaling for Value-Based Deep RL
Preston Fu
Oleh Rybkin
Zhiyuan Zhou
Michal Nauman
Pieter Abbeel
Sergey Levine
Aviral Kumar
OffRL
185
2
0
20 Aug 2025