Improving Policy Gradient by Exploring Under-appreciated Rewards


28 November 2016
Ofir Nachum
Mohammad Norouzi
Dale Schuurmans

Papers citing "Improving Policy Gradient by Exploring Under-appreciated Rewards"

9 / 9 papers shown
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
36
3
0
11 May 2023
Learning to Reach Goals via Iterated Supervised Learning
Dibya Ghosh
Abhishek Gupta
Ashwin Reddy
Justin Fu
Coline Devin
Benjamin Eysenbach
Sergey Levine
24
33
0
12 Dec 2019
Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses
Matt Grenander
Yue Dong
Jackie C.K. Cheung
Annie Louis
14
35
0
08 Sep 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
35
185
0
05 Jun 2019
Efficient Entropy for Policy Gradient with Multidimensional Action Space
Yiming Zhang
Q. Vuong
Kenny Song
Xiao-Yue Gong
Keith Ross
25
16
0
02 Jun 2018
Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search
Linnan Wang
Yiyang Zhao
Yuu Jinnai
Yuandong Tian
Rodrigo Fonseca
BDL
23
50
0
18 May 2018
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
Kelvin Guu
Panupong Pasupat
E. Liu
Percy Liang
34
190
0
25 Apr 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,502
0
25 Jan 2017
An Alternative Softmax Operator for Reinforcement Learning
Kavosh Asadi
Michael L. Littman
20
10
0
16 Dec 2016