Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
arXiv:1801.01290, v2 (latest)

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,552 papers shown
Free Energy-Inspired Cognitive Risk Integration for AV Navigation in Pedestrian-Rich Environments
Meiting Dang
Yanping Wu
Yafei Wang
Dezong Zhao
David Flynn
Chongfeng Wei
199
0
0
28 Jul 2025
TADT-CSA: Temporal Advantage Decision Transformer with Contrastive State Abstraction for Generative Recommendation
Yantao Du
Tianyuan Liu
Yisha Li
Jingxin Liu
Lexi Gao
Xin Li
Haiyang Lu
Liyin Hong
OffRL
201
0
0
27 Jul 2025
FAST: Similarity-based Knowledge Transfer for Efficient Policy Learning
Alessandro Capurso
Elia Piccoli
Davide Bacciu
121
0
0
27 Jul 2025
The Policy Cliff: A Theoretical Analysis of Reward-Policy Maps in Large Language Models
Xingcheng Xu
210
3
0
27 Jul 2025
Directly Learning Stock Trading Strategies Through Profit Guided Loss Functions
Devroop Kar
Zimeng Lyu
Sheeraja Rajakrishnan
Hao Zhang
Alex Ororbia
Travis J. Desell
Daniel E. Krutz
AIFin, AI4TS
199
0
0
25 Jul 2025
Extending Group Relative Policy Optimization to Continuous Control: A Theoretical Framework for Robotic Reinforcement Learning
Rajat Khanda
Mohammad Baqar
Sambuddha Chakrabarti
Satyasaran Changdar
126
1
0
25 Jul 2025
Hierarchical Deep Reinforcement Learning Framework for Multi-Year Asset Management Under Budget Constraints
Amir Fard
Arnold X.-X. Yuan
127
0
0
25 Jul 2025
Observations Meet Actions: Learning Control-Sufficient Representations for Robust Policy Generalization
Yuliang Gu
H. Cao
Marco Caccamo
N. Hovakimyan
OffRL, BDL
209
0
0
25 Jul 2025
HARLF: Hierarchical Reinforcement Learning and Lightweight LLM-Driven Sentiment Integration for Financial Portfolio Optimization
Benjamin Coriat
Eric Benhamou
AIFin
118
1
0
24 Jul 2025
Prolonging Tool Life: Learning Skillful Use of General-purpose Tools through Lifespan-guided Reinforcement Learning
Po-Yen Wu
Cheng-Yu Kuo
Y. Kadokawa
Takamitsu Matsubara
OffRL, CLL
151
0
0
23 Jul 2025
LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks
Lijie Zheng
Ji He
Shih Yu Chang
Yulong Shen
Dusit Niyato
155
2
0
23 Jul 2025
Learning Temporal Abstractions via Variational Homomorphisms in Option-Induced Abstract MDPs
Chang Li
Yaren Zhang
Haoran Lv
Qiong Cao
Chao Xue
Xiaodong He
OffRL, LRM
210
0
0
22 Jul 2025
Multi-agent Reinforcement Learning for Robotized Coral Reef Sample Collection
Daniel Correa
Tero Kaarlela
Jose Fuentes
Paulo Padrao
Alain Duran
Leonardo Bobadilla
107
1
0
22 Jul 2025
RAD: Retrieval High-quality Demonstrations to Enhance Decision-making
Lu Guo
Yixiang Shan
Zhengbang Zhu
Qifan Liang
Lichang Song
Ting Long
Weinan Zhang
Yi-Ju Chang
OffRL
207
0
0
21 Jul 2025
Mixture of Autoencoder Experts Guidance using Unlabeled and Incomplete Data for Exploration in Reinforcement Learning
Elias Malomgré
Pieter Simoens
OffRL
154
0
0
21 Jul 2025
One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms
Zijian Zhao
Sen Li
189
1
0
21 Jul 2025
Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
Hijaz Ahmad
Ehsan Sabouni
Alexander Wasilkoff
Param Budhraja
Zijian Guo
Songyuan Zhang
Chuchu Fan
Christos G. Cassandras
Wenchao Li
252
2
0
20 Jul 2025
Balancing Expressivity and Robustness: Constrained Rational Activations for Reinforcement Learning
Rafał Surdej
Michał Bortkiewicz
Alex Lewandowski
M. Ostaszewski
Clare Lyle
193
1
0
19 Jul 2025
Age of Information Minimization in UAV-Enabled Integrated Sensing and Communication Systems
Yu Bai
Yifan Zhang
Boxuan Xie
Zheng Chang
Yanru Zhang
Riku Jäntti
Zhu Han
174
0
0
18 Jul 2025
Signal Temporal Logic Compliant Co-design of Planning and Control
Manas Sashank Juvvi
Tushar Dilip Kurne
Vaishnavi J
Shishir Kolathaya
Pushpak Jagtap
248
1
0
17 Jul 2025
Relative Entropy Pathwise Policy Optimization
C. Voelcker
Axel Brunnbauer
Marcel Hussing
Michal Nauman
Pieter Abbeel
Eric Eaton
Radu Grosu
Amir-massoud Farahmand
Igor Gilitschenski
408
1
0
15 Jul 2025
ILCL: Inverse Logic-Constraint Learning from Temporally Constrained Demonstrations
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Minwoo Cho
Jaehwi Jang
Daehyung Park
229
0
0
15 Jul 2025
Real-Time Adaptive Motion Planning via Point Cloud-Guided, Energy-Based Diffusion and Potential Fields
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Wondmgezahu Teshome
Kian Behzad
Octavia Camps
Michael Everett
Milad Siami
Mario Sznaier
DiffM
264
0
0
12 Jul 2025
Optimistic Exploration for Risk-Averse Constrained Reinforcement Learning
J. McCarthy
Radu Marinescu
Elizabeth M. Daly
Ivana Dusparic
180
0
0
11 Jul 2025
Reinforcement Learning with Action Chunking
Qiyang Li
Zhiyuan Zhou
Sergey Levine
OffRL, OnRL
411
24
0
10 Jul 2025
"So, Tell Me About Your Policy...": Distillation of interpretable policies from Deep Reinforcement Learning agents
Giovanni Dispoto
Paolo Bonetti
Marcello Restelli
OffRL
234
0
0
10 Jul 2025
Growing Trees with an Agent: Accelerating RRTs with Learned, Multi-Step Episodic Exploration
Xinyu Wu
OffRL
147
0
0
09 Jul 2025
Q-STAC: Q-Guided Stein Variational Model Predictive Actor-Critic
Shizhe Cai
Zeya Yin
Jayadeep Jacob
Fabio Ramos
BDL
188
0
0
09 Jul 2025
2048: Reinforcement Learning in a Delayed Reward Environment
Prady Saligram
Tanvir Bhathal
Robby Manihani
OffRL
213
1
0
07 Jul 2025
Planning under Uncertainty to Goal Distributions
Adam Conkey
Tucker Hermans
388
3
0
01 Jul 2025
Active Inference AI Systems for Scientific Discovery
Karthik Duraisamy
AI4CE, LRM
442
1
0
26 Jun 2025
Diverse Mini-Batch Selection in Reinforcement Learning for Efficient Chemical Exploration in de novo Drug Design
Hampus Gummesson Svensson
Ola Engkvist
J. Janet
C. Tyrchan
M. Chehreghani
OffRL
351
0
0
26 Jun 2025
Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning
Prajwal Koirala
Cody Fleming
OffRL
323
5
0
26 Jun 2025
Autonomous Cyber Resilience via a Co-Evolutionary Arms Race within a Fortified Digital Twin Sandbox
Malikussaid
Sutiyo
196
0
0
25 Jun 2025
TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design
Geonwoo Cho
Jaegyun Im
Jihwan Lee
Hojun Yi
Sejin Kim
Sundong Kim
246
0
0
24 Jun 2025
DRARL: Disengagement-Reason-Augmented Reinforcement Learning for Efficient Improvement of Autonomous Driving Policy
Weitao Zhou
Bo Zhang
Zhong Cao
X. Li
Qian Cheng
Chunyang Liu
Y. Zhang
Diange Yang
187
2
0
20 Jun 2025
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation
Kosuke Nakanishi
Akihiro Kubo
Yuji Yasui
Shin Ishii
AAML, OffRL
227
0
0
20 Jun 2025
Discrete Compositional Generation via General Soft Operators and Robust Reinforcement Learning
Marco Jiralerspong
E. Derman
Danilo Vucetic
Nikolay Malkin
Bilun Sun
Tianyu Zhang
Pierre-Luc Bacon
Gauthier Gidel
OffRL
328
1
0
20 Jun 2025
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning
Guozheng Ma
Lu Li
Zilin Wang
Li Shen
Pierre-Luc Bacon
Dacheng Tao
OffRL
184
6
0
20 Jun 2025
Robust Dynamic Material Handling via Adaptive Constrained Evolutionary Reinforcement Learning
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Chengpeng Hu
Ziming Wang
Bo Yuan
Jialin Liu
Chengqi Zhang
Xin Yao
211
0
0
20 Jun 2025
GoalLadder: Incremental Goal Discovery with Vision-Language Models
Alexey Zakharov
Shimon Whiteson
252
1
0
19 Jun 2025
Distribution Parameter Actor-Critic: Shifting the Agent-Environment Boundary for Diverse Action Spaces
Jiamin He
A. Rupam Mahmood
Martha White
109
0
0
19 Jun 2025
Data-Driven Policy Mapping for Safe RL-based Energy Management Systems
Energy Reports (Energy Rep.), 2025
Theo Zangato
A. Osmani
Pegah Alizadeh
165
1
0
19 Jun 2025
BIDA: A Bi-level Interaction Decision-making Algorithm for Autonomous Vehicles in Dynamic Traffic Scenarios
Liyang Yu
Tianyi Wang
Junfeng Jiao
Fengwu Shan
Hongqing Chu
B. Gao
145
2
0
19 Jun 2025
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
Andrew Wagenmaker
Mitsuhiko Nakamoto
Yunchu Zhang
S. Park
Waleed Yagoub
Anusha Nagabandi
Abhishek Gupta
Sergey Levine
OffRL
330
28
0
18 Jun 2025
Learning Task-Agnostic Motifs to Capture the Continuous Nature of Animal Behavior
Jiyi Wang
Jingyang Ke
Bo Dai
Anqi Wu
180
0
0
18 Jun 2025
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Roger Creus Castanyer
J. Obando-Ceron
Lu Li
Pierre-Luc Bacon
Glen Berseth
Aaron Courville
Pablo Samuel Castro
224
6
0
18 Jun 2025
CAWR: Corruption-Averse Advantage-Weighted Regression for Robust Policy Optimization
Ranting Hu
OffRL
307
0
0
18 Jun 2025
Reasoning with Exploration: An Entropy Perspective
Daixuan Cheng
Shaohan Huang
Xuekai Zhu
Bo Dai
Wayne Xin Zhao
Zhenliang Zhang
Furu Wei
LRM
337
131
0
17 Jun 2025
Advancing Safe Mechanical Ventilation Using Offline RL With Hybrid Actions and Clinically Aligned Rewards
Muhammad Hamza Yousuf
Jason Li
S. Vahdati
Raphael Theilen
Jakob Wittenstein
Jens Lehmann
OffRL
201
1
0
17 Jun 2025