ResearchTrend.AI
Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation

29 November 2019
Melkior Ornik
Ufuk Topcu
    OOD

Papers citing "Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation"

9 / 9 papers shown
Time-Constrained Robust MDPs
Adil Zouitine
David Bertoin
Pierre Clavier
Matthieu Geist
Emmanuel Rachelson
OOD
66
1
0
12 Jun 2024
A Moral Imperative: The Need for Continual Superalignment of Large Language Models
Gokul Puthumanaillam
Manav Vora
Pranay Thangeda
Melkior Ornik
87
7
0
13 Mar 2024
Weathering Ongoing Uncertainty: Learning and Planning in a Time-Varying Partially Observable Environment
Gokul Puthumanaillam
Xiangyu Liu
Negar Mehr
Melkior Ornik
66
5
0
06 Dec 2023
The complexity of non-stationary reinforcement learning
Christos H. Papadimitriou
Binghui Peng
48
3
0
13 Jul 2023
Client Selection for Federated Policy Optimization with Environment Heterogeneity
Zhijie Xie
S. H. Song
61
4
0
18 May 2023
Model-Free Learning and Optimal Policy Design in Multi-Agent MDPs Under Probabilistic Agent Dropout
Carmel Fiscko
S. Kar
Bruno Sinopoli
60
1
0
24 Apr 2023
Unsupervised Person Re-identification via Simultaneous Clustering and Consistency Learning
Hanne I Oberman
Jiayan Qiu
S. van Buuren
G. Vink
Zhanyu Ma
Jun Guo
57
15
0
01 Apr 2021
Robust Policy Gradient against Strong Data Corruption
Xuezhou Zhang
Yiding Chen
Xiaojin Zhu
Wen Sun
AAML
99
39
0
11 Feb 2021
Optimizing for the Future in Non-Stationary MDPs
Yash Chandak
Georgios Theocharous
Shiv Shankar
Martha White
Sridhar Mahadevan
Philip S. Thomas
OffRL
92
65
0
17 May 2020