Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
International Conference on Machine Learning (ICML), 2022
arXiv: 2207.12141 · 25 July 2022
Xiyao Wang, Wichayaporn Wongkamjan, Furong Huang
Papers citing "Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy"

17 of 17 papers shown
MrCoM: A Meta-Regularized World-Model Generalizing Across Multi-Scenarios
Xuantang Xiong, Ni Mu, Runpeng Xie, Senhao Yang, Y. Wang, ..., Yao Luan, Siyuan Li, Shuang Xu, Yiqin Yang, Bo Xu
OffRL · 09 Nov 2025
Fixing That Free Lunch: When, Where, and Why Synthetic Data Fails in Model-Based Policy Optimization
Brett Barkley, David Fridovich-Keil
OffRL · 01 Oct 2025
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Xinze Wang, Zhiyong Yang, Chao Feng, Hongjin Lu, Linjie Li, Chung-Ching Lin, Kevin Qinghong Lin, Furong Huang, Lijuan Wang
OODD · ReLM · LRM · VLM · 10 Apr 2025
Learning World Models for Unconstrained Goal Navigation
Neural Information Processing Systems (NeurIPS), 2024
Yuanlin Duan, Wensen Mao, He Zhu
03 Nov 2024
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning
Xiyao Wang, Linfeng Song, Ye Tian, Dian Yu, Baolin Peng, Haitao Mi, Furong Huang, Dong Yu
LRM · 09 Oct 2024
Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yuhang Zhou, Jing Zhu, Paiheng Xu, Xiaoyu Liu, Xiyao Wang, Danai Koutra, Wei Ai, Furong Huang
19 Jun 2024
Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models
Zeyu Fang, Tian Lan
OffRL · 30 May 2024
Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Bernd Frauenknecht, Artur Eisele, Devdutt Subhasish, Friedrich Solowjow, Sebastian Trimpe
29 May 2024
Mind the Model, Not the Agent: The Primacy Bias in Model-based RL
European Conference on Artificial Intelligence (ECAI), 2023
Zhongjian Qiao, Jiafei Lyu, Xiu Li
23 Oct 2023
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
International Conference on Learning Representations (ICLR), 2023
Xiyao Wang, Ruijie Zheng, Yanchao Sun, Ruonan Jia, Wichayaporn Wongkamjan, Huazhe Xu, Furong Huang
OffRL · 11 Oct 2023
A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning
Ran Wei, Nathan Lambert, Anthony D. McDonald, Alfredo Garcia, Roberto Calandra
10 Oct 2023
How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization
Neural Information Processing Systems (NeurIPS), 2023
Hai Zhang, Hang Yu, Siyue Tao, Di Zhang, Chang Huang, Hongtu Zhou, Xiao Zhang, Chen Ye
22 Sep 2023
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
International Conference on Machine Learning (ICML), 2023
Tianying Ji, Yuping Luo, Gang Hua, Xianyuan Zhan, Jianwei Zhang, Huazhe Xu
OffRL · OnRL · 05 Jun 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Xiao Hu, Jianxiong Li, Xianyuan Zhan, Qing-Shan Jia, Ya Zhang
27 May 2023
TOM: Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching
Conference on Learning for Dynamics & Control (L4DC), 2023
Yecheng Jason Ma, K. Sivakumar, Jason Yan, Osbert Bastani, Dinesh Jayaraman
OffRL · MU · 22 May 2023
Relative Policy-Transition Optimization for Fast Policy Transfer
AAAI Conference on Artificial Intelligence (AAAI), 2022
Jiawei Xu, Cheng Zhou, Yizheng Zhang, Zhengyou Zhang, Lei Han
13 Jun 2022
ED2: Environment Dynamics Decomposition World Models for Continuous Control
Jianye Hao, Yifu Yuan, Cong Wang, Zhen Wang
OffRL · 06 Dec 2021