q-exponential family for policy optimization

International Conference on Learning Representations (ICLR), 2024
14 August 2024
Lingwei Zhu, Haseeb Shah, Han Wang, Yukie Nagai, Martha White
OffRL
arXiv: 2408.07245 (abs · PDF · HTML)

Papers citing "q-exponential family for policy optimization"

25 / 25 papers shown

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
International Conference on Learning Representations (ICLR), 2023
Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor Chan, Xianyuan Zhan
OffRL · 250 · 102 · 0 · 28 Mar 2023

The In-Sample Softmax for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White
OffRL · 158 · 28 · 0 · 28 Feb 2023

Quasi-optimal Reinforcement Learning with Continuous Actions
International Conference on Learning Representations (ICLR), 2023
Yuhan Li, Wenzhuo Zhou, Ruoqing Zhu
OffRL · 171 · 9 · 0 · 21 Jan 2023

Extreme Q-Learning: MaxEnt RL without Entropy
International Conference on Learning Representations (ICLR), 2023
Divyansh Garg, Joey Hejna, Matthieu Geist, Stefano Ermon
OffRL · 235 · 101 · 0 · 05 Jan 2023

A Policy-Guided Imitation Approach for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Haoran Xu, Li Jiang, Jianxiong Li, Xianyuan Zhan
OffRL · 367 · 74 · 0 · 15 Oct 2022

Offline Reinforcement Learning with Implicit Q-Learning
International Conference on Learning Representations (ICLR), 2021
Ilya Kostrikov, Ashvin Nair, Sergey Levine
OffRL · 494 · 1,164 · 0 · 12 Oct 2021

Sparse Continuous Distributions and Fenchel-Young Losses
André F. T. Martins, Marcos Vinícius Treviso, António Farinhas, P. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae
174 · 16 · 0 · 04 Aug 2021

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel
150 · 16 · 0 · 15 Jun 2021

A Minimalist Approach to Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Scott Fujimoto, S. Gu
OffRL · 332 · 969 · 0 · 12 Jun 2021

On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans
418 · 322 · 0 · 13 May 2020

D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine
GPOffRL · 668 · 1,585 · 0 · 15 Apr 2020

Leverage the Average: an Analysis of KL Regularization in RL
Nino Vieillard, Tadashi Kozuno, B. Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist
246 · 46 · 0 · 31 Mar 2020

PyTorch: An Imperative Style, High-Performance Deep Learning Library
Neural Information Processing Systems (NeurIPS), 2019
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, ..., Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala
ODL · 952 · 48,122 · 0 · 03 Dec 2019

Entropic Regularization of Markov Decision Processes
Boris Belousov, Jan Peters
170 · 25 · 0 · 06 Jul 2019

Sparse Sequence-to-Sequence Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Ben Peters, Vlad Niculae, André F. T. Martins
TPM · 668 · 243 · 0 · 14 May 2019

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban
420 · 285 · 0 · 18 Jan 2019

Understanding the impact of entropy on policy optimization
Zafarali Ahmed, Nicolas Le Roux, Mohammad Norouzi, Dale Schuurmans
219 · 279 · 0 · 27 Nov 2018

Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
Samuel Neumann, Sungsu Lim, A. Joseph, Yangchen Pan, Adam White, Martha White
356 · 10 · 0 · 22 Oct 2018

A Lyapunov-based Approach to Safe Reinforcement Learning
Yinlam Chow, Ofir Nachum, Edgar A. Duénez-Guzmán, Mohammad Ghavamzadeh
283 · 561 · 0 · 20 May 2018

Path Consistency Learning in Tsallis Entropy Regularized MDPs
Ofir Nachum, Yinlam Chow, Mohammad Ghavamzadeh
175 · 49 · 0 · 10 Feb 2018

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
1.3K · 9,873 · 0 · 04 Jan 2018

Sparse Markov Decision Processes with Causal Sparse Tsallis Entropy Regularization for Reinforcement Learning
Kyungjae Lee, Sungjoon Choi, Songhwai Oh
287 · 69 · 0 · 19 Sep 2017

Two-temperature logistic regression based on the Tsallis divergence
Ehsan Amid, Manfred K. Warmuth, Sriram Srinivasan
NoLa · 189 · 28 · 0 · 19 May 2017

Continuous Deep Q-Learning with Model-based Acceleration
S. Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
233 · 1,043 · 0 · 02 Mar 2016

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification
André F. T. Martins, Ramón Fernández Astudillo
466 · 787 · 0 · 05 Feb 2016