Sample-Optimal Parametric Q-Learning Using Linearly Additive Features

13 February 2019
Lin F. Yang
Mengdi Wang
    VLM

Papers citing "Sample-Optimal Parametric Q-Learning Using Linearly Additive Features"

10 / 10 papers shown
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
118
306
0
01 Jun 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
91
222
0
29 Feb 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Qiaomin Xie
Yudong Chen
Zhaoran Wang
Zhuoran Yang
170
127
0
17 Feb 2020
Can Agents Learn by Analogy? An Inferable Model for PAC Reinforcement Learning
Yanchao Sun
Furong Huang
63
4
0
21 Dec 2019
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
Andrea Zanette
David Brandfonbrener
Emma Brunskill
Matteo Pirotta
A. Lazaric
150
132
0
01 Nov 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
90
111
0
25 Jun 2019
Feature-Based Q-Learning for Two-Player Stochastic Games
Zeyu Jia
Lin F. Yang
Mengdi Wang
98
45
0
02 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL, GP
116
288
0
24 May 2019
AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity
Yibo Zeng
Fei Feng
W. Yin
56
3
0
03 Dec 2018
State Aggregation Learning from Markov Transition Data
Shiqi Wang
Yizheng Chen
Ahmed Abdou
115
54
0
06 Nov 2018