1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,552 papers shown
Compute-Optimal Scaling for Value-Based Deep RL
Preston Fu
Oleh Rybkin
Zhiyuan Zhou
Michal Nauman
Pieter Abbeel
Sergey Levine
Aviral Kumar
OffRL
185
2
0
20 Aug 2025
MAVIS: Multi-Objective Alignment via Value-Guided Inference-Time Search
Jeremy Carleton
Debajoy Mukherjee
Srinivas Shakkottai
D. Kalathil
218
1
0
19 Aug 2025
FedRAIN-Lite: Federated Reinforcement Algorithms for Improving Idealised Numerical Weather and Climate Models
Pritthijit Nath
Sebastian Schemm
Henry Moss
Peter Haynes
Emily Shuckburgh
Mark Webb
AI4CE
127
0
0
19 Aug 2025
Ethics-Aware Safe Reinforcement Learning for Rare-Event Risk Control in Interactive Urban Driving
Dianzhao Li
Ostap Okhrin
224
0
0
19 Aug 2025
CAMAR: Continuous Actions Multi-Agent Routing
Artem Pshenitsyn
Aleksandr I. Panov
A. Skrynnik
138
0
0
18 Aug 2025
Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey
Rui Shao
W. Li
Lingsen Zhang
Renshan Zhang
Zhiyang Liu
Ran Chen
Liqiang Nie
LM&Ro
247
29
0
18 Aug 2025
Synthetic Data is Sufficient for Zero-Shot Visual Generalization from Offline Data
Ahmet H. Güzel
Ilija Bogunovic
Jack Parker-Holder
OffRL
OnRL
219
0
0
17 Aug 2025
Contact-Rich and Deformable Foot Modeling for Locomotion Control of the Human Musculoskeletal System
Haixin Gong
Chen Zhang
Yanan Sui
64
0
0
16 Aug 2025
Beyond Fixed Morphologies: Learning Graph Policies with Trust Region Compensation in Variable Action Spaces
Thomas Gallien
115
0
0
16 Aug 2025
Fusing Rewards and Preferences in Reinforcement Learning
Sadegh Khorasani
Saber Salehkaleybar
Negar Kiyavash
Matthias Grossglauser
155
1
0
15 Aug 2025
ETTRL: Balancing Exploration and Exploitation in LLM Test-Time Reinforcement Learning Via Entropy Mechanism
Jia Liu
ChangYi He
YingQiao Lin
M. Yang
FeiYang Shen
Shaoguo Liu
166
10
0
15 Aug 2025
A learning-driven automatic planning framework for proton PBS treatments of H&N cancers
Qingqing Wang
Liqiang Xiao
Chang Chang
149
0
0
14 Aug 2025
Large Model Empowered Embodied AI: A Survey on Decision-Making and Embodied Learning
Wenlong Liang
Rui Zhou
Yang Ma
Bing Zhang
Songlin Li
Yijia Liao
Ping Kuang
LM&Ro
3DV
AI4CE
170
9
0
14 Aug 2025
GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
Kelin Yu
Sheng Zhang
Harshit Soora
Furong Huang
Heng Huang
Erfaun Noorani
Ruohan Gao
VGen
94
4
0
14 Aug 2025
Towards Safe Imitation Learning via Potential Field-Guided Flow Matching
Haoran Ding
Anqing Duan
Zezhou Sun
Leonel Rozo
Noémie Jaquier
Dezhen Song
Yoshihiko Nakamura
140
0
0
12 Aug 2025
SegDAC: Improving Visual Reinforcement Learning by Extracting Dynamic Object-Centric Representations from Pretrained Vision Models
Alexandre Brown
Glen Berseth
VLM
207
0
0
12 Aug 2025
Sparsity-Driven Plasticity in Multi-Task Reinforcement Learning
Aleksandar Todorov
Juan Cardenas-Cartagena
Rafael F. Cunha
Marco Zullich
Matthia Sabatelli
CLL
140
1
0
09 Aug 2025
Offline-to-Online Reinforcement Learning with Classifier-Free Diffusion Generation
Xiao Huang
Xu Liu
Enze Zhang
T. Yu
Shuai Li
OffRL
OnRL
193
0
0
09 Aug 2025
Learning Causal Structure Distributions for Robust Planning
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Alejandro Murillo-Gonzalez
Junhong Xu
Lantao Liu
CML
202
1
0
08 Aug 2025
Reparameterization Proximal Policy Optimization
Hai Zhong
Xun Wang
Zhuoran Li
Longbo Huang
185
0
0
08 Aug 2025
DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
Rui Yu
Xianghang Zhang
Runkai Zhao
HuaiCheng Yan
Meng Wang
89
4
0
07 Aug 2025
Uncertainty-aware Predict-Then-Optimize Framework for Equitable Post-Disaster Power Restoration
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Lin Jiang
Dahai Yu
Rongchao Xu
Tian Tang
Guang Wang
130
1
0
06 Aug 2025
GTPO and GRPO-S: Token and Sequence-Level Reward Shaping with Policy Entropy
Hongze Tan
Jianfei Pan
Jinghao Lin
Tao Chen
Zhihang Zheng
Zhihao Tang
HaiHua Yang
270
15
0
06 Aug 2025
Sequence Aware SAC Control for Engine Fuel Consumption Optimization in Electrified Powertrain
Wafeeq Jaleel
Md Ragib Rownak
Athar Hanif
Sidra Ghayour Bhatti
Qadeer Ahmed
84
0
0
06 Aug 2025
CogniPlan: Uncertainty-Guided Path Planning with Conditional Generative Layout Prediction
Yizhuo Wang
Haodong He
Jingsong Liang
Yuhong Cao
Ritabrata Chakraborty
Guillaume Sartoretti
118
2
0
05 Aug 2025
Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies
Yi Ma
Hongyao Tang
Chenjun Xiao
Yaodong Yang
Wei Wei
Jianye Hao
Jiye Liang
OffRL
178
0
0
05 Aug 2025
Reinforcement Learning for Target Zone Blood Glucose Control
David H. Mguni
Jing Dong
Wanrong Yang
Ziquan Liu
Muhammad Salman Haleem
Baoxiang Wang
OffRL
OOD
50
0
0
05 Aug 2025
Computationally efficient Gauss-Newton reinforcement learning for model predictive control
D. Brandner
Sebastien Gros
Sergio Lucia
128
0
0
04 Aug 2025
Multi-Policy Pareto Front Tracking Based Online and Offline Multi-Objective Reinforcement Learning
Zeyu Zhao
Yueling Che
Kaichen Liu
Jian Li
Junmei Yao
OffRL
140
0
0
04 Aug 2025
Decomposing the Entropy-Performance Exchange: The Missing Keys to Unlocking Effective Reinforcement Learning
Jia Deng
Jie Chen
Zhipeng Chen
Wayne Xin Zhao
Ji-Rong Wen
LRM
153
9
0
04 Aug 2025
Is Exploration or Optimization the Problem for Deep Reinforcement Learning?
Glen Berseth
OffRL
154
1
0
02 Aug 2025
MoRe-ERL: Learning Motion Residuals using Episodic Reinforcement Learning
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Xi Huang
Hongyi Zhou
Ge Li
Yucheng Tang
Weiran Liao
B. Hein
Tamim Asfour
Rudolf Lioutikov
159
1
0
02 Aug 2025
OID-PPO: Optimal Interior Design using Proximal Policy Optimization by Transforming Design Guidelines into Reward Functions
Chanyoung Yoon
Sangbong Yoo
Soobin Yim
Chansoo Kim
Yun Jang
53
0
0
01 Aug 2025
Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations
Yuki Shirai
Kei Ota
Devesh K. Jha
Diego Romeres
241
0
0
01 Aug 2025
Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts
IEEE Transactions on Mobile Computing (IEEE TMC), 2025
Jin Yang
Qiong Wu
Zhiying Feng
Zhi Zhou
Deke Guo
Xu Chen
136
3
0
01 Aug 2025
Learning Network Dismantling Without Handcrafted Inputs
Haozhe Tian
P. Ferraro
Robert Shorten
Mahdi Jalili
Homayoun Hamedmoghadam
GNN
180
0
0
01 Aug 2025
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents
Jianqiang Xiao
Yuexuan Sun
Yixin Shao
Boxi Gan
Rongqiang Liu
Yanjing Wu
Weili Guan
Xiang Deng
284
0
0
01 Aug 2025
Directional Ensemble Aggregation for Actor-Critics
Nicklas Werge
Yi-Shan Wu
Bahareh Tasdighi
M. Kandemir
OffRL
184
0
0
31 Jul 2025
One-Step Flow Policy Mirror Descent
Tianyi Chen
Haitong Ma
Na Li
Kai Wang
Bo Dai
258
1
0
31 Jul 2025
Personalized Education with Ranking Alignment Recommendation
Haipeng Liu
Y. Liu
Ting Long
AI4Ed
136
0
0
31 Jul 2025
Benchmarking Massively Parallelized Multi-Task Reinforcement Learning for Robotics Tasks
Vira Joshi
Zifan Xu
Bo Liu
Peter Stone
Amy Zhang
OffRL
275
6
0
31 Jul 2025
Efficient Differentially Private Fine-Tuning of LLMs via Reinforcement Learning
Afshin Khadangi
Amir Sartipi
Xiaohui Wu
Ramin Bahmani
Gilbert Fridgen
140
0
0
30 Jul 2025
Learning to Prune Branches in Modern Tree-Fruit Orchards
IEEE International Conference on Robotics and Automation (ICRA), 2025
Abhinav Jain
Cindy Grimm
Stefan Lee
90
0
0
30 Jul 2025
On Policy Stochasticity in Mutual Information Optimal Control of Linear Systems
Shoju Enami
Kenji Kashima
106
0
0
29 Jul 2025
Assistax: A Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics
Leonard Hinckeldey
Elliot Fosong
Elle Miller
Rimvydas Rubavicius
Trevor A. McInroe
Patricia Wollstadt
Christiane B. Wiebel-Herboth
Subramanian Ramamoorthy
Stefano V. Albrecht
156
0
0
29 Jul 2025
DeepGo: Predictive Directed Greybox Fuzzing
Network and Distributed System Security Symposium (NDSS), 2025
Peihong Lin
Pengfei Wang
Xu Zhou
Wei Xie
Gen Zhang
Kai Lu
278
9
0
29 Jul 2025
MoDeSuite: Robot Learning Task Suite for Benchmarking Mobile Manipulation with Deformable Objects
Yuying Zhang
K. Luck
Francesco Verdoja
Ville Kyrki
Joni Pajarinen
178
0
0
29 Jul 2025
Handoff Design in User-Centric Cell-Free Massive MIMO Networks Using DRL
IEEE Transactions on Communications (IEEE Trans. Commun.), 2025
Hussein A. Ammar
R. Adve
S. Shahbazpanahi
G. Boudreau
Israfil Bahceci
98
0
0
28 Jul 2025
Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
International Conference on Learning Representations (ICLR), 2025
Saket Tiwari
Omer Gottesman
George Konidaris
226
3
0
28 Jul 2025
Free Energy-Inspired Cognitive Risk Integration for AV Navigation in Pedestrian-Rich Environments
Meiting Dang
Yanping Wu
Yafei Wang
Dezong Zhao
David Flynn
Chongfeng Wei
190
0
0
28 Jul 2025