1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,552 papers shown
Mind Your Entropy: From Maximum Entropy to Trajectory Entropy-Constrained RL
Guojian Zhan
Likun Wang
Pengcheng Wang
Feihong Zhang
Jingliang Duan
Masayoshi Tomizuka
Shengbo Eben Li
78
0
0
25 Oct 2025
Toward Humanoid Brain-Body Co-design: Joint Optimization of Control and Morphology for Fall Recovery
Bo Yue
Sheng Xu
Kui Jia
Guiliang Liu
AI4CE
160
1
0
25 Oct 2025
Computational Hardness of Reinforcement Learning with Partial q^π-Realizability
Shayan Karimi
Xiaoqi Tan
155
0
0
24 Oct 2025
DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection
Tala Aljaafari
Varun Kanade
Philip Torr
Christian Schroeder de Witt
OODD
OffRL
260
0
0
24 Oct 2025
GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation
Guangqi Jiang
Haoran Chang
Ri-Zhao Qiu
Yutong Liang
Mazeyu Ji
Jiyue Zhu
Zhao Dong
Xueyan Zou
Xiaolong Wang
3DGS
184
3
0
23 Oct 2025
Multi-Modal Decentralized Reinforcement Learning for Modular Reconfigurable Lunar Robots
Ashutosh Mishra
S. Santra
Elian Neppel
Edoardo M. Rossi Lombardi
Shamistan Karimov
Kentaro Uno
Kazuya Yoshida
84
0
0
23 Oct 2025
Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models
Mingen Li
Houjian Yu
Yixuan Huang
Youngjin Hong
Changhyun Choi
129
0
0
22 Oct 2025
A Communication-Efficient Decentralized Actor-Critic Algorithm
Xiaoxing Ren
Nicola Bastianello
Thomas Parisini
Andreas A. Malikopoulos
105
0
0
22 Oct 2025
Continual Knowledge Adaptation for Reinforcement Learning
Jinwu Hu
Zihao Lian
Z. Wen
Chenghao Li
Guohao Chen
Xutao Wen
Bin Xiao
Mingkui Tan
CLL
KELM
196
1
0
22 Oct 2025
SEA: Semantic Map Prediction for Active Exploration of Uncertain Areas
Hongyu Ding
Xinyue Liang
Yudong Fang
You Wu
Jieqi Shi
Jing Huo
W. Li
Jing Wu
Yu-kun Lai
Yang Gao
157
0
0
22 Oct 2025
Efficient Model-Based Reinforcement Learning for Robot Control via Online Learning
Fang Nan
Hao Ma
Qinghua Guan
Josie Hughes
Michael Muehlebach
Marco Hutter
OffRL
124
1
0
21 Oct 2025
NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
Xiaohan Qin
Xiaoxing Wang
Ning Liao
Junchi Yan
128
0
0
21 Oct 2025
Actor-Free Continuous Control via Structurally Maximizable Q-Functions
Yigit Korkmaz
Urvi Bhuwania
Ayush Jain
Erdem Bıyık
OffRL
117
0
0
21 Oct 2025
Heterogeneous Adversarial Play in Interactive Environments
Manjie Xu
Xinyi Yang
Jiayu Zhan
Wei Liang
Chi Zhang
Yixin Zhu
153
0
0
21 Oct 2025
ADPO: Anchored Direct Preference Optimization
Wang Zixian
318
0
0
21 Oct 2025
ALPINE: A Lightweight and Adaptive Privacy-Decision Agent Framework for Dynamic Edge Crowdsensing
Guanjie Cheng
Siyang Liu
Junqin Huang
Xinkui Zhao
Yin Wang
Mengying Zhu
Linghe Kong
Shuiguang Deng
117
0
0
20 Oct 2025
RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation
Yuquan Xue
Guanxing Lu
Zhenyu Wu
Chuanrui Zhang
Bofang Jia
Zhengyi Gu
Yansong Tang
Ziwei Wang
205
0
0
20 Oct 2025
Provably Optimal Reinforcement Learning under Safety Filtering
Donggeon David Oh
D. Nguyen
Haimin Hu
J. F. Fisac
OffRL
129
0
0
20 Oct 2025
D2C-HRHR: Discrete Actions with Double Distributional Critics for High-Risk-High-Return Tasks
Jundong Zhang
Yuhui Situ
Fanji Zhang
Rongji Deng
Tianqi Wei
OffRL
100
0
0
20 Oct 2025
Closing the Sim2Real Performance Gap in RL
Akhil S. Anand
Shambhuraj Sawant
Jasper Hoffmann
D. Reinhardt
S. Gros
OffRL
160
1
0
20 Oct 2025
Multimodal Safety Is Asymmetric: Cross-Modal Exploits Unlock Black-Box MLLMs Jailbreaks
Xinkai Wang
Beibei Li
Zerui Shao
Ao Liu
Shouling Ji
AAML
121
1
0
20 Oct 2025
Consistent Zero-Shot Imitation with Contrastive Goal Inference
Kathryn Wantlin
Chongyi Zheng
Benjamin Eysenbach
187
0
0
20 Oct 2025
Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control
Chengxiu Hua
Jiawen Gu
Yushun Tang
261
0
0
20 Oct 2025
Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
Mingyang Sun
Pengxiang Ding
Weinan Zhang
Donglin Wang
183
0
0
17 Oct 2025
HEADER: Hierarchical Robot Exploration via Attention-Based Deep Reinforcement Learning with Expert-Guided Reward
Yuhong Cao
Yizhuo Wang
Jingsong Liang
Shuhao Liao
Yifeng Zhang
Peizhuo Li
Guillaume Sartoretti
106
0
0
17 Oct 2025
ProSh: Probabilistic Shielding for Model-free Reinforcement Learning
Edwin Hamel-De le Court
Gaspard Ohlmann
Francesco Belardinelli
141
0
0
17 Oct 2025
OffSim: Offline Simulator for Model-based Offline Inverse Reinforcement Learning
Woo-Jin Ahn
Sang-Ryul Baek
Yong-Jun Lee
H. Choi
M. Lim
OffRL
104
0
0
17 Oct 2025
A Hard-Label Black-Box Evasion Attack against ML-based Malicious Traffic Detection Systems
Zixuan Liu
Yi Zhao
Zhuotao Liu
Qi Li
Chuanpu Fu
Guangmeng Zhou
Ke Xu
AAML
106
0
0
16 Oct 2025
RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning
Kun Lei
Huanyu Li
Dongjie Yu
Zhenyu Wei
Lingxiao Guo
Zhennan Jiang
Ziyu Wang
Shiyu Liang
Huazhe Xu
OffRL
VLM
350
5
0
16 Oct 2025
ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning
Roger Creus Castanyer
Faisal Mohamed
Pablo Samuel Castro
Cyrus Neary
Glen Berseth
OffRL
LRM
AI4CE
215
0
0
16 Oct 2025
SkyDreamer: Interpretable End-to-End Vision-Based Drone Racing with Model-Based Reinforcement Learning
Aderik Verraest
Stavrow A. Bahnam
Robin Ferede
Guido C. H. E. de Croon
Christophe De Wagter
178
1
0
16 Oct 2025
ViTacGen: Robotic Pushing with Vision-to-Touch Generation
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Z. Wu
Yijiong Lin
Yongqiang Zhao
Xuyang Zhang
Zhuo Chen
Nathan Lepora
Shan Luo
149
1
0
15 Oct 2025
A New Perspective on Transformers in Online Reinforcement Learning for Continuous Control
Nikita Kachaev
Daniil Zelezetsky
Egor Cherepanov
Alexey K. Kovalev
Aleksandr I. Panov
OffRL
143
2
0
15 Oct 2025
Transfer learning strategies for accelerating reinforcement-learning-based flow control
Saeed Salehi
AI4CE
121
0
0
15 Oct 2025
STEMS: Spatial-Temporal Enhanced Safe Multi-Agent Coordination for Building Energy Management
Huiliang Zhang
Di Wu
Arnaud Zinflou
Benoit Boulet
AI4CE
65
0
0
15 Oct 2025
Thompson Sampling via Fine-Tuning of LLMs
Nicolas Menet
Aleksandar Terzić
Michael Hersche
Andreas Krause
Abbas Rahimi
181
0
0
15 Oct 2025
Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
J. Obando-Ceron
Walter Mayor
Samuel Lavoie
Scott Fujimoto
Aaron Courville
Pablo Samuel Castro
147
1
0
15 Oct 2025
Bayesian Optimization for Dynamic Pricing and Learning
Anush Anand
Pranav Agrawal
Tejas Bodas
133
0
0
14 Oct 2025
Diffusion Models for Reinforcement Learning: Foundations, Taxonomy, and Development
Changfu Xu
Jianxiong Guo
Yuzhu Liang
Haiyang Huang
Haodong Zou
Xi Zheng
Shui Yu
Xiaowen Chu
Jiannong Cao
Tian-sheng Wang
OffRL
AI4CE
208
0
0
14 Oct 2025
Finite-time Convergence Analysis of Actor-Critic with Evolving Reward
Rui Hu
Yu Chen
Longbo Huang
150
0
0
14 Oct 2025
Human-in-the-Loop Bandwidth Estimation for Quality of Experience Optimization in Real-Time Video Communication
Sami Khairy
Gabriel Mittag
Vishak Gopal
Ross Cutler
93
0
0
14 Oct 2025
Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning
Guozheng Ma
Lu Li
Zilin Wang
Haoyu Wang
Shengchao Hu
Leszek Rutkowski
D. Tao
AI4CE
171
0
0
14 Oct 2025
Inclusive Fitness as a Key Step Towards More Advanced Social Behaviors in Multi-Agent Reinforcement Learning Settings
Andries Rosseau
Raphael Avalos
Ann Nowé
89
0
0
14 Oct 2025
Heterogeneous RBCs via deep multi-agent reinforcement learning
Federico Gabriele
Aldo Glielmo
Marco Taboga
75
1
0
14 Oct 2025
ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty
Chenliang Li
Junyu Leng
Jiaxiang Li
Youbang Sun
Shixiang Chen
Shahin Shahrampour
Alfredo García
112
0
0
13 Oct 2025
Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
Murad Dawood
Usama Ahmed Siddiquie
Shahram Khorshidi
Maren Bennewitz
156
0
0
13 Oct 2025
Game-Theoretic Risk-Shaped Reinforcement Learning for Safe Autonomous Driving
Dong Hu
Fenqing Hu
Lidong Yang
Chao Huang
125
0
0
13 Oct 2025
Reinforced sequential Monte Carlo for amortised sampling
Sanghyeok Choi
Sarthak Mittal
Victor Elvira
Jinkyoo Park
Nikolay Malkin
126
0
0
13 Oct 2025
Refinery: Active Fine-tuning and Deployment-time Optimization for Contact-Rich Policies
Bingjie Tang
Iretiayo Akinola
Jie Xu
Bowen Wen
Dieter Fox
Gaurav Sukhatme
Fabio Ramos
Abhishek Gupta
Yashraj S. Narang
OffRL
97
0
0
13 Oct 2025
A Primer on SO(3) Action Representations in Deep Reinforcement Learning
Martin Schuck
Sherif Samy
Angela P. Schoellig
101
0
0
13 Oct 2025