Apprenticeship Learning via Frank-Wolfe

5 November 2019
Tom Zahavy
Alon Cohen
Haim Kaplan
Yishay Mansour

Papers citing "Apprenticeship Learning via Frank-Wolfe"

12 / 12 papers shown
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
194
0
0
12 May 2025
MetaCURL: Non-stationary Concave Utility Reinforcement Learning
B. Moreno
Margaux Brégère
Pierre Gaillard
Nadia Oudjane
OffRL
87
1
0
30 May 2024
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
Zihao Li
Boyi Liu
Zhuoran Yang
Zhaoran Wang
Mengdi Wang
80
1
0
16 Feb 2024
Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning
Tianchi Cai
Jiyan Jiang
Wenpeng Zhang
Shiji Zhou
Xierui Song
Li Yu
Lihong Gu
Xiaodong Zeng
Jinjie Gu
Guannan Zhang
OffRL
47
3
0
06 Sep 2023
Diversifying AI: Towards Creative Chess with AlphaZero
Tom Zahavy
Vivek Veeriah
Shaobo Hou
Kevin Waugh
Matthew Lai
Edouard Leurent
Nenad Tomašev
Lisa Schut
Demis Hassabis
Satinder Singh
87
16
0
17 Aug 2023
Provably Efficient Adversarial Imitation Learning with Unknown Transitions
Tian Xu
Ziniu Li
Yang Yu
Zhimin Luo
64
10
0
11 Jun 2023
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Theodore H. Moskovitz
Brendan O'Donoghue
Vivek Veeriah
Sebastian Flennerhag
Satinder Singh
Tom Zahavy
96
21
0
02 Feb 2023
Improved Policy Optimization for Online Imitation Learning
J. Lavington
Sharan Vaswani
Mark Schmidt
OffRL
79
6
0
29 Jul 2022
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
96
43
0
18 Jun 2021
Reward is enough for convex MDPs
Tom Zahavy
Brendan O'Donoghue
Guillaume Desjardins
Satinder Singh
131
76
0
01 Jun 2021
Online Apprenticeship Learning
Lior Shani
Tom Zahavy
Shie Mannor
OffRL
77
26
0
13 Feb 2021
Discovering a set of policies for the worst case reward
Tom Zahavy
André Barreto
D. Mankowitz
Shaobo Hou
Brendan O'Donoghue
Iurii Kemaev
Satinder Singh
OffRL
61
23
0
08 Feb 2021