Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
arXiv:1801.01290

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,552 papers shown
Mind Your Entropy: From Maximum Entropy to Trajectory Entropy-Constrained RL
Guojian Zhan
Likun Wang
Pengcheng Wang
Feihong Zhang
Jingliang Duan
Masayoshi Tomizuka
Shengbo Eben Li
78
0
0
25 Oct 2025
Toward Humanoid Brain-Body Co-design: Joint Optimization of Control and Morphology for Fall Recovery
Bo Yue
Sheng Xu
Kui Jia
Guiliang Liu
AI4CE
160
1
0
25 Oct 2025
Computational Hardness of Reinforcement Learning with Partial $q^π$-Realizability
Shayan Karimi
Xiaoqi Tan
155
0
0
24 Oct 2025
DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection
Tala Aljaafari
Varun Kanade
Philip Torr
Christian Schroeder de Witt
OODD, OffRL
260
0
0
24 Oct 2025
GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation
Guangqi Jiang
Haoran Chang
Ri-Zhao Qiu
Yutong Liang
Mazeyu Ji
Jiyue Zhu
Zhao Dong
Xueyan Zou
Xiaolong Wang
3DGS
184
3
0
23 Oct 2025
Multi-Modal Decentralized Reinforcement Learning for Modular Reconfigurable Lunar Robots
Ashutosh Mishra
S. Santra
Elian Neppel
Edoardo M. Rossi Lombardi
Shamistan Karimov
Kentaro Uno
Kazuya Yoshida
84
0
0
23 Oct 2025
Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models
Mingen Li
Houjian Yu
Yixuan Huang
Youngjin Hong
Changhyun Choi
129
0
0
22 Oct 2025
A Communication-Efficient Decentralized Actor-Critic Algorithm
Xiaoxing Ren
Nicola Bastianello
Thomas Parisini
Andreas A. Malikopoulos
105
0
0
22 Oct 2025
Continual Knowledge Adaptation for Reinforcement Learning
Jinwu Hu
Zihao Lian
Z. Wen
Chenghao Li
Guohao Chen
Xutao Wen
Bin Xiao
Mingkui Tan
CLL, KELM
196
1
0
22 Oct 2025
SEA: Semantic Map Prediction for Active Exploration of Uncertain Areas
Hongyu Ding
Xinyue Liang
Yudong Fang
You Wu
Jieqi Shi
Jing Huo
W. Li
Jing Wu
Yu-kun Lai
Yang Gao
157
0
0
22 Oct 2025
Efficient Model-Based Reinforcement Learning for Robot Control via Online Learning
Fang Nan
Hao Ma
Qinghua Guan
Josie Hughes
Michael Muehlebach
Marco Hutter
OffRL
124
1
0
21 Oct 2025
NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
Xiaohan Qin
Xiaoxing Wang
Ning Liao
Junchi Yan
128
0
0
21 Oct 2025
Actor-Free Continuous Control via Structurally Maximizable Q-Functions
Yigit Korkmaz
Urvi Bhuwania
Ayush Jain
Erdem Bıyık
OffRL
117
0
0
21 Oct 2025
Heterogeneous Adversarial Play in Interactive Environments
Manjie Xu
Xinyi Yang
Jiayu Zhan
Wei Liang
Chi Zhang
Yixin Zhu
153
0
0
21 Oct 2025
ADPO: Anchored Direct Preference Optimization
Wang Zixian
318
0
0
21 Oct 2025
ALPINE: A Lightweight and Adaptive Privacy-Decision Agent Framework for Dynamic Edge Crowdsensing
Guanjie Cheng
Siyang Liu
Junqin Huang
Xinkui Zhao
Yin Wang
Mengying Zhu
Linghe Kong
Shuiguang Deng
117
0
0
20 Oct 2025
RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation
Yuquan Xue
Guanxing Lu
Zhenyu Wu
Chuanrui Zhang
Bofang Jia
Zhengyi Gu
Yansong Tang
Ziwei Wang
205
0
0
20 Oct 2025
Provably Optimal Reinforcement Learning under Safety Filtering
Donggeon David Oh
D. Nguyen
Haimin Hu
J. F. Fisac
OffRL
129
0
0
20 Oct 2025
D2C-HRHR: Discrete Actions with Double Distributional Critics for High-Risk-High-Return Tasks
Jundong Zhang
Yuhui Situ
Fanji Zhang
Rongji Deng
Tianqi Wei
OffRL
100
0
0
20 Oct 2025
Closing the Sim2Real Performance Gap in RL
Akhil S. Anand
Shambhuraj Sawant
Jasper Hoffmann
D. Reinhardt
S. Gros
OffRL
160
1
0
20 Oct 2025
Multimodal Safety Is Asymmetric: Cross-Modal Exploits Unlock Black-Box MLLMs Jailbreaks
Xinkai Wang
Beibei Li
Zerui Shao
Ao Liu
Shouling Ji
AAML
121
1
0
20 Oct 2025
Consistent Zero-Shot Imitation with Contrastive Goal Inference
Kathryn Wantlin
Chongyi Zheng
Benjamin Eysenbach
187
0
0
20 Oct 2025
Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control
Chengxiu Hua
Jiawen Gu
Yushun Tang
261
0
0
20 Oct 2025
Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
Mingyang Sun
Pengxiang Ding
Weinan Zhang
Donglin Wang
183
0
0
17 Oct 2025
HEADER: Hierarchical Robot Exploration via Attention-Based Deep Reinforcement Learning with Expert-Guided Reward
Yuhong Cao
Yizhuo Wang
Jingsong Liang
Shuhao Liao
Yifeng Zhang
Peizhuo Li
Guillaume Sartoretti
106
0
0
17 Oct 2025
ProSh: Probabilistic Shielding for Model-free Reinforcement Learning
Edwin Hamel-De le Court
Gaspard Ohlmann
Francesco Belardinelli
141
0
0
17 Oct 2025
OffSim: Offline Simulator for Model-based Offline Inverse Reinforcement Learning
Woo-Jin Ahn
Sang-Ryul Baek
Yong-Jun Lee
H. Choi
M. Lim
OffRL
104
0
0
17 Oct 2025
A Hard-Label Black-Box Evasion Attack against ML-based Malicious Traffic Detection Systems
Zixuan Liu
Yi Zhao
Zhuotao Liu
Qi Li
Chuanpu Fu
Guangmeng Zhou
Ke Xu
AAML
106
0
0
16 Oct 2025
RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning
Kun Lei
Huanyu Li
Dongjie Yu
Zhenyu Wei
Lingxiao Guo
Zhennan Jiang
Ziyu Wang
Shiyu Liang
Huazhe Xu
OffRL, VLM
350
5
0
16 Oct 2025
ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning
Roger Creus Castanyer
Faisal Mohamed
Pablo Samuel Castro
Cyrus Neary
Glen Berseth
OffRL, LRM, AI4CE
215
0
0
16 Oct 2025
SkyDreamer: Interpretable End-to-End Vision-Based Drone Racing with Model-Based Reinforcement Learning
Aderik Verraest
Stavrow A. Bahnam
Robin Ferede
Guido C. H. E. de Croon
Christophe De Wagter
178
1
0
16 Oct 2025
ViTacGen: Robotic Pushing with Vision-to-Touch Generation
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Z. Wu
Yijiong Lin
Yongqiang Zhao
Xuyang Zhang
Zhuo Chen
Nathan Lepora
Shan Luo
149
1
0
15 Oct 2025
A New Perspective on Transformers in Online Reinforcement Learning for Continuous Control
Nikita Kachaev
Daniil Zelezetsky
Egor Cherepanov
Alexey K. Kovalev
Aleksandr I. Panov
OffRL
143
2
0
15 Oct 2025
Transfer learning strategies for accelerating reinforcement-learning-based flow control
Saeed Salehi
AI4CE
121
0
0
15 Oct 2025
STEMS: Spatial-Temporal Enhanced Safe Multi-Agent Coordination for Building Energy Management
Huiliang Zhang
Di Wu
Arnaud Zinflou
Benoit Boulet
AI4CE
65
0
0
15 Oct 2025
Thompson Sampling via Fine-Tuning of LLMs
Nicolas Menet
Aleksandar Terzić
Michael Hersche
Andreas Krause
Abbas Rahimi
181
0
0
15 Oct 2025
Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
J. Obando-Ceron
Walter Mayor
Samuel Lavoie
Scott Fujimoto
Aaron Courville
Pablo Samuel Castro
147
1
0
15 Oct 2025
Bayesian Optimization for Dynamic Pricing and Learning
Anush Anand
Pranav Agrawal
Tejas Bodas
133
0
0
14 Oct 2025
Diffusion Models for Reinforcement Learning: Foundations, Taxonomy, and Development
Changfu Xu
Jianxiong Guo
Yuzhu Liang
Haiyang Huang
Haodong Zou
Xi Zheng
Shui Yu
Xiaowen Chu
Jiannong Cao
Tian-sheng Wang
OffRL, AI4CE
208
0
0
14 Oct 2025
Finite-time Convergence Analysis of Actor-Critic with Evolving Reward
Rui Hu
Yu Chen
Longbo Huang
150
0
0
14 Oct 2025
Human-in-the-Loop Bandwidth Estimation for Quality of Experience Optimization in Real-Time Video Communication
Sami Khairy
Gabriel Mittag
Vishak Gopal
Ross Cutler
93
0
0
14 Oct 2025
Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning
Guozheng Ma
Lu Li
Zilin Wang
Haoyu Wang
Shengchao Hu
Leszek Rutkowski
D. Tao
AI4CE
171
0
0
14 Oct 2025
Inclusive Fitness as a Key Step Towards More Advanced Social Behaviors in Multi-Agent Reinforcement Learning Settings
Andries Rosseau
Raphael Avalos
Ann Nowé
89
0
0
14 Oct 2025
Heterogeneous RBCs via deep multi-agent reinforcement learning
Federico Gabriele
Aldo Glielmo
Marco Taboga
75
1
0
14 Oct 2025
ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty
Chenliang Li
Junyu Leng
Jiaxiang Li
Youbang Sun
Shixiang Chen
Shahin Shahrampour
Alfredo García
112
0
0
13 Oct 2025
Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
Murad Dawood
Usama Ahmed Siddiquie
Shahram Khorshidi
Maren Bennewitz
156
0
0
13 Oct 2025
Game-Theoretic Risk-Shaped Reinforcement Learning for Safe Autonomous Driving
Dong Hu
Fenqing Hu
Lidong Yang
Chao Huang
125
0
0
13 Oct 2025
Reinforced sequential Monte Carlo for amortised sampling
Sanghyeok Choi
Sarthak Mittal
Victor Elvira
Jinkyoo Park
Nikolay Malkin
126
0
0
13 Oct 2025
Refinery: Active Fine-tuning and Deployment-time Optimization for Contact-Rich Policies
Bingjie Tang
Iretiayo Akinola
Jie Xu
Bowen Wen
Dieter Fox
Gaurav Sukhatme
Fabio Ramos
Abhishek Gupta
Yashraj S. Narang
OffRL
97
0
0
13 Oct 2025
A Primer on SO(3) Action Representations in Deep Reinforcement Learning
Martin Schuck
Sherif Samy
Angela P. Schoellig
101
0
0
13 Oct 2025