ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2111.02994
  4. Cited By
Towards an Understanding of Default Policies in Multitask Policy
  Optimization

Towards an Understanding of Default Policies in Multitask Policy Optimization

4 November 2021
Theodore H. Moskovitz
Michael Arbel
Jack Parker-Holder
Aldo Pacchiano
ArXivPDFHTML

Papers citing "Towards an Understanding of Default Policies in Multitask Policy Optimization"

9 / 9 papers shown
Title
Confronting Reward Model Overoptimization with Constrained RLHF
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
34
47
0
06 Oct 2023
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for
  Last-Iterate Convergence in Constrained MDPs
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Theodore H. Moskovitz
Brendan O'Donoghue
Vivek Veeriah
Sebastian Flennerhag
Satinder Singh
Tom Zahavy
42
19
0
02 Feb 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi Ma
Sergey Levine
37
14
0
24 Dec 2022
Transfer RL via the Undo Maps Formalism
Transfer RL via the Undo Maps Formalism
Abhi Gupta
Theodore H. Moskovitz
David Alvarez-Melis
Aldo Pacchiano
OffRL
25
0
0
26 Nov 2022
Minimum Description Length Control
Minimum Description Length Control
Theodore H. Moskovitz
Ta-Chu Kao
M. Sahani
M. Botvinick
26
1
0
17 Jul 2022
Joint Representation Training in Sequential Tasks with Shared Structure
Joint Representation Training in Sequential Tasks with Shared Structure
Aldo Pacchiano
Ofir Nachum
Nilseh Tripuraneni
Peter L. Bartlett
33
5
0
24 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to
  Accelerate Progress
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron C. Courville
Marc G. Bellemare
OffRL
OnRL
26
63
0
03 Jun 2022
A First-Occupancy Representation for Reinforcement Learning
A First-Occupancy Representation for Reinforcement Learning
Theodore H. Moskovitz
S. Wilson
M. Sahani
31
15
0
28 Sep 2021
Iterative Amortized Policy Optimization
Iterative Amortized Policy Optimization
Joseph Marino
Alexandre Piché
Alessandro Davide Ialongo
Yisong Yue
OffRL
63
21
0
20 Oct 2020
1