1611.09321
Improving Policy Gradient by Exploring Under-appreciated Rewards
28 November 2016
Ofir Nachum
Mohammad Norouzi
Dale Schuurmans
Papers citing
"Improving Policy Gradient by Exploring Under-appreciated Rewards"
9 / 9 papers shown
Title
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
36
3
0
11 May 2023
Learning to Reach Goals via Iterated Supervised Learning
Dibya Ghosh
Abhishek Gupta
Ashwin Reddy
Justin Fu
Coline Devin
Benjamin Eysenbach
Sergey Levine
24
33
0
12 Dec 2019
Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses
Matt Grenander
Yue Dong
Jackie C.K. Cheung
Annie Louis
14
35
0
08 Sep 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
35
185
0
05 Jun 2019
Efficient Entropy for Policy Gradient with Multidimensional Action Space
Yiming Zhang
Q. Vuong
Kenny Song
Xiao-Yue Gong
Keith Ross
25
16
0
02 Jun 2018
Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search
Linnan Wang
Yiyang Zhao
Yuu Jinnai
Yuandong Tian
Rodrigo Fonseca
BDL
23
50
0
18 May 2018
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
Kelvin Guu
Panupong Pasupat
E. Liu
Percy Liang
34
190
0
25 Apr 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,502
0
25 Jan 2017
An Alternative Softmax Operator for Reinforcement Learning
Kavosh Asadi
Michael L. Littman
20
10
0
16 Dec 2016