ResearchTrend.AI
Policy Optimization with Model-based Explorations


18 November 2018
Feiyang Pan
Qingpeng Cai
Anxiang Zeng
C. Pan
Qing Da
Hua-Lin He
Qing He
Pingzhong Tang

Papers citing "Policy Optimization with Model-based Explorations"

9 / 9 papers shown
Title
Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback
Haolin Liu
Chen-Yu Wei
Julian Zimmert
22
6
0
17 Oct 2023
Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning
Parvin Malekzadeh
Ming Hou
Konstantinos N. Plataniotis
51
1
0
16 Oct 2023
On the Origins of Self-Modeling
Robert Kwiatkowski
Yuhang Hu
Boyuan Chen
Hod Lipson
19
4
0
05 Sep 2022
Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution
Feiyang Pan
Tongzhe Zhang
Ling Luo
Jia He
Shuoli Liu
14
7
0
22 Jul 2022
Trust the Model When It Is Confident: Masked Model-based Actor-Critic
Feiyang Pan
Jia He
Dandan Tu
Qing He
OffRL
20
47
0
10 Oct 2020
GoChat: Goal-oriented Chatbots with Hierarchical Reinforcement Learning
Jianfeng Liu
Feiyang Pan
Ling Luo
OffRL
20
23
0
24 May 2020
Zero Shot Learning on Simulated Robots
Robert Kwiatkowski
Hod Lipson
11
0
0
04 Oct 2019
Field-aware Calibration: A Simple and Empirically Strong Method for Reliable Probabilistic Predictions
Feiyang Pan
Xiang Ao
Pingzhong Tang
Min Lu
Dapeng Liu
Lei Xiao
Qing He
30
22
0
26 May 2019
Warm Up Cold-start Advertisements: Improving CTR Predictions via Learning to Learn ID Embeddings
Feiyang Pan
Shuokai Li
Xiang Ao
Pingzhong Tang
Qing He
23
184
0
25 Apr 2019