PEGASUS: A Policy Search Method for Large MDPs and POMDPs


Conference on Uncertainty in Artificial Intelligence (UAI), 2000
16 January 2013
A. Ng
Michael I. Jordan

Papers citing "PEGASUS: A Policy Search Method for Large MDPs and POMDPs"

50 / 65 papers shown
Agent-state based policies in POMDPs: Beyond belief-state MDPs
IEEE Conference on Decision and Control (CDC), 2024
Amit Sinha
Aditya Mahajan
213
10
0
24 Sep 2024
Reinforcement learning
Florentin Wörgötter
585
2,932
0
16 May 2024
Body Schema Acquisition through Active Learning
Ruben Martinez-Cantin
M. Lopes
Luis Montesano
110
52
0
08 Feb 2024
ExploitFlow, cyber security exploitation routes for Game Theory and AI research in robotics
Víctor Mayoral-Vilches
Gelei Deng
Yi Liu
M. Pinzger
Stefan Rass
126
4
0
04 Aug 2023
Sample Average Approximation for Black-Box VI
Javier Burroni
Justin Domke
Daniel Sheldon
200
6
0
13 Apr 2023
Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization
Annual Conference on Genetic and Evolutionary Computation (GECCO), 2023
R. T. Lange
Tom Schaul
Yutian Chen
Chris Xiaoxuan Lu
Tom Zahavy
Valentin Dalibard
Sebastian Flennerhag
277
39
0
08 Apr 2023
Relative Sparsity for Medical Decision Problems
Statistics in Medicine (Stat Med), 2022
Samuel J. Weisenthal
Sally W. Thurston
Ashkan Ertefaie
203
4
0
29 Nov 2022
Discovering Evolution Strategies via Meta-Black-Box Optimization
International Conference on Learning Representations (ICLR), 2022
R. T. Lange
Tom Schaul
Yutian Chen
Tom Zahavy
Valentin Dalibard
Chris Xiaoxuan Lu
Satinder Singh
Sebastian Flennerhag
341
55
0
21 Nov 2022
Hindsight Learning for MDPs with Exogenous Inputs
International Conference on Machine Learning (ICML), 2022
Sean R. Sinclair
Felipe Vieira Frujeri
Ching-An Cheng
Luke Marshall
Hugo Barbalho
Jingling Li
Jennifer Neville
Ishai Menache
Adith Swaminathan
220
29
0
13 Jul 2022
Cluster-Based Control of Transition-Independent MDPs
Carmel Fiscko
S. Kar
Bruno Sinopoli
184
2
0
11 Jul 2022
The Parametric Cost Function Approximation: A new approach for multistage stochastic programming
Warren B. Powell
Saeed Ghadimi
116
7
0
01 Jan 2022
Robot Learning from Randomized Simulations: A Review
Frontiers in Robotics and AI (Front. Robot. AI), 2021
Fabio Muratore
Fabio Ramos
Greg Turk
Wenhao Yu
Michael Gienger
Jan Peters
AI4CE
322
111
0
01 Nov 2021
Robust Predictable Control
Neural Information Processing Systems (NeurIPS), 2021
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
OffRL
221
49
0
07 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
298
99
0
01 Sep 2021
Reinforcement Learning to Optimize Lifetime Value in Cold-Start Recommendation
Luo Ji
Qin Qi
Bingqing Han
Hongxia Yang
OffRL
113
31
0
20 Aug 2021
Partially Observable Markov Decision Processes (POMDPs) and Robotics
H. Kurniawati
182
19
0
15 Jul 2021
A Bayesian Approach to Identifying Representational Errors
Ramya Ramakrishnan
Vaibhav Unhelkar
Ece Kamar
J. Shah
193
4
0
28 Mar 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Neural Information Processing Systems (NeurIPS), 2021
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
231
88
0
09 Feb 2021
Model-Based Policy Search Using Monte Carlo Gradient Estimation with Real Systems Application
IEEE Transactions on Robotics (T-RO), 2021
Fabio Amadio
Alberto Dalla Libera
R. Antonello
D. Nikovski
R. Carli
Diego Romeres
293
36
0
28 Jan 2021
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
International Conference on Machine Learning (ICML), 2020
Susan Amin
Maziar Gomrokchi
Hossein Aboutalebi
Harsh Satija
Doina Precup
172
17
0
26 Dec 2020
Counterfactual Credit Assignment in Model-Free Reinforcement Learning
International Conference on Machine Learning (ICML), 2020
Thomas Mesnard
T. Weber
Fabio Viola
S. Thakoor
Alaa Saade
...
A. Guez
Éric Moulines
Marcus Hutter
Lars Buesing
Rémi Munos
CML
OffRL
234
67
0
18 Nov 2020
A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken
Colin Daniels
Rosie Zhao
Anna M. Brandenberger
Prakash Panangaden
Doina Precup
148
0
0
03 Nov 2020
Average-reward model-free reinforcement learning: a systematic review and literature mapping
Vektor Dewanto
George Dunn
A. Eshragh
M. Gallagher
Fred Roosta
249
38
0
18 Oct 2020
Reinforcement Learning
Olivier Buffet
Olivier Pietquin
Paul Weng
OffRL
114
0
0
29 May 2020
Influence-aware Memory Architectures for Deep Reinforcement Learning
Miguel Suau
Jinke He
E. Congeduti
Rolf A. N. Starre
A. Czechowski
F. Oliehoek
194
5
0
18 Nov 2019
If MaxEnt RL is the Answer, What is the Question?
Benjamin Eysenbach
Sergey Levine
156
65
0
04 Oct 2019
FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control
Longxiang Shi
Shijian Li
LongBing Cao
Long Yang
Gang Zheng
Gang Pan
223
5
0
01 Jul 2019
On Value Functions and the Agent-Environment Boundary
Nan Jiang
OffRL
346
25
0
30 May 2019
PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos
Paavo Parmas
C. Rasmussen
Jan Peters
Kenji Doya
170
93
0
04 Feb 2019
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
361
143
0
15 Oct 2018
A Hybrid Approach for Trajectory Control Design
L. Freda
M. Gianni
F. Pirri
103
0
0
08 Oct 2018
Learning Scheduling Algorithms for Data Processing Clusters
Hongzi Mao
Malte Schwarzkopf
S. Venkatakrishnan
Zili Meng
Mohammad Alizadeh
OffRL
353
731
0
03 Oct 2018
Policy Optimization via Importance Sampling
Alberto Maria Metelli
Matteo Papini
Francesco Faccio
Marcello Restelli
OffRL
259
97
0
17 Sep 2018
A survey on policy search algorithms for learning robot controllers in a handful of trials
IEEE Transactions on Robotics (T-RO), 2018
Konstantinos Chatzilygeroudis
Vassilis Vassiliades
F. Stulp
Sylvain Calinon
Jean-Baptiste Mouret
434
168
0
06 Jul 2018
Variance Reduction for Reinforcement Learning in Input-Driven Environments
International Conference on Learning Representations (ICLR), 2018
Hongzi Mao
S. Venkatakrishnan
Malte Schwarzkopf
Mohammad Alizadeh
OffRL
213
104
0
06 Jul 2018
Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning
J. A. G. Higuera
David Meger
Gregory Dudek
BDL
175
40
0
06 Mar 2018
Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning
Xiangyu Zhao
Li Zhang
Zhuoye Ding
Long Xia
Shucheng Zhou
D. Yin
310
368
0
19 Feb 2018
Deep Reinforcement Learning for List-wise Recommendations
Xiangyu Zhao
Li Zhang
Yue Zhao
Zhuoye Ding
D. Yin
Shucheng Zhou
313
182
0
30 Dec 2017
Data-driven Planning via Imitation Learning
Sanjiban Choudhury
M. Bhardwaj
S. Arora
Ashish Kapoor
G. Ranade
Sebastian Scherer
Debadeepta Dey
207
90
0
17 Nov 2017
Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control
Sanket Kamthe
M. Deisenroth
336
227
0
20 Jun 2017
Dynamic Motion Planning for Aerial Surveillance on a Fixed-Wing UAV
Vaibhav Darbari
Saksham Gupta
O. Verma
132
23
0
22 May 2017
Experimental results: Reinforcement Learning of POMDPs using Spectral Methods
Kamyar Azizzadenesheli
A. Lazaric
Anima Anandkumar
188
9
0
07 May 2017
Black-Box Data-efficient Policy Search for Robotics
Konstantinos Chatzilygeroudis
R. Rama
Rituraj Kaushik
Dorian Goepp
Vassilis Vassiliades
Jean-Baptiste Mouret
OffRL
218
116
0
21 Mar 2017
Sample Efficient Policy Search for Optimal Stopping Domains
International Joint Conference on Artificial Intelligence (IJCAI), 2017
Karan Goel
Christoph Dann
Emma Brunskill
96
8
0
21 Feb 2017
Reinforcement Learning Algorithm Selection
Romain Laroche
Raphael Feraud
OffRL
163
8
0
30 Jan 2017
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire
312
440
0
29 Oct 2016
DESPOT: Online POMDP Planning with Regularization
N. Ye
A. Somani
David Hsu
Wee Sun Lee
398
528
0
12 Sep 2016
Configuration Lattices for Planar Contact Manipulation Under Uncertainty
Michael C. Koval
David Hsu
N. S. Pollard
S. Srinivasa
250
12
0
30 Apr 2016
Reinforcement Learning of POMDPs using Spectral Methods
Kamyar Azizzadenesheli
A. Lazaric
Anima Anandkumar
199
138
0
25 Feb 2016
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
982
7,501
0
19 Feb 2015