On-line Policy Improvement using Monte-Carlo Search

9 January 2025

Papers citing "On-line Policy Improvement using Monte-Carlo Search"

50 / 52 papers shown

Title
A Survey on Self-play Methods in Reinforcement Learning Ruize Zhang Zelai Xu Chengdong Ma Chao Yu Weijuan Tu ... Deheng Ye Wenbo Ding Yaodong Yang Yu Wang SyDa SSL OnRL 51 8 0 02 Aug 2024
Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming Dimitri Bertsekas 41 6 0 02 Jun 2024
An Approximate Dynamic Programming Framework for Occlusion-Robust Multi-Object Tracking Pratyusha Musunuru Yuchao Li Jamison Weber Dimitri P. Bertsekas 43 0 0 24 May 2024
Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective Victor-Alexandru Darvariu Stephen Hailes Mirco Musolesi AI4CE 50 6 0 09 Apr 2024
Tree Search in DAG Space with Model-based Reinforcement Learning for Causal Discovery Victor-Alexandru Darvariu Stephen Hailes Mirco Musolesi CML 46 2 0 20 Oct 2023
Iterative Option Discovery for Planning, by Planning Kenny Young Richard S. Sutton 25 2 0 02 Oct 2023
Magnetic Field-Based Reward Shaping for Goal-Conditioned Reinforcement Learning Hongyu Ding Yuan-Yan Tang Qing Wu Bo Wang Chunlin Chen Zhi Wang 37 4 0 16 Jul 2023
The Update-Equivalence Framework for Decision-Time Planning Samuel Sokota Gabriele Farina David J. Wu Hengyuan Hu Kevin A. Wang J. Zico Kolter Noam Brown 30 3 0 25 Apr 2023
A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games Anna Winnicki R. Srikant 34 1 0 17 Mar 2023
Multiagent Rollout with Reshuffling for Warehouse Robots Path Planning William Emanuelsson Alejandro Penacho Riveiros Yuchao Li Karl H. Johansson Jonas Mårtensson 22 1 0 15 Nov 2022
Nested Search versus Limited Discrepancy Search Tristan Cazenave 32 0 0 01 Oct 2022
Regret Analysis for Hierarchical Experts Bandit Problem Qihan Guo Siwei Wang Jun Zhu 24 0 0 11 Aug 2022
A Survey on Model-based Reinforcement Learning Fan Luo Tian Xu Hang Lai Xiong-Hui Chen Weinan Zhang Yang Yu OffRL LRM 44 101 0 19 Jun 2022
Learning from Drivers to Tackle the Amazon Last Mile Routing Research Challenge Chen Wu Yin Song Verdi March Eden Duthie 32 7 0 09 May 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation Maximilian Igl Daewoo Kim Alex Kuefler Paul Mougin Punit Shah K. Shiarlis Drago Anguelov Mark Palatucci Brandyn White Shimon Whiteson 35 64 0 06 May 2022
A Dynamic Programming Algorithm for Finding an Optimal Sequence of Informative Measurements P. Loxley Ka Wai Cheung 23 3 0 24 Sep 2021
Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control Dimitri Bertsekas AI4CE 50 55 0 20 Aug 2021
Model-Based Opponent Modeling Xiaopeng Yu Jiechuan Jiang Wanpeng Zhang Haobin Jiang Zongqing Lu OffRL 27 28 0 04 Aug 2021
Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN Shai Ben-Assayag Ran El-Yaniv GNN 27 9 0 18 Jul 2021
Leveraging Tripartite Interaction Information from Live Stream E-Commerce for Improving Product Recommendation Sanshi Lei Yu Zhuoxuan Jiang Dongdong Chen Shanshan Feng Dongsheng Li Qi Liu Jinfeng Yi 38 20 0 07 Jun 2021
Annotating Motion Primitives for Simplifying Action Search in Reinforcement Learning I. Sledge Darshan W. Bryner José C. Príncipe 20 1 0 24 Feb 2021
Monte Carlo Rollout Policy for Recommendation Systems with Dynamic User Behavior R. Meshram Kesav Kaza OffRL 19 1 0 08 Feb 2021
Deep Controlled Learning for Inventory Control Tarkan Temizoz Christina Imdahl R. Dijkman Douniel Lamghari-Idrissi W. Jaarsveld 24 8 0 30 Nov 2020
On the role of planning in model-based deep reinforcement learning Jessica B. Hamrick A. Friesen Feryal M. P. Behbahani A. Guez Fabio Viola Sims Witherspoon Thomas W. Anthony Lars Buesing Petar Velickovic T. Weber OffRL 19 65 0 08 Nov 2020
Lifelong Incremental Reinforcement Learning with Online Bayesian Inference Zhi Wang Chunlin Chen D. Dong CLL OffRL 12 56 0 28 Jul 2020
Simulation Based Algorithms for Markov Decision Processes and Multi-Action Restless Bandits R. Meshram Kesav Kaza 22 10 0 25 Jul 2020
Model-based Reinforcement Learning: A Survey Thomas M. Moerland Joost Broekens Aske Plaat Catholijn M. Jonker OffRL 25 47 0 30 Jun 2020
A Unifying Framework for Reinforcement Learning and Planning Thomas M. Moerland Joost Broekens Aske Plaat Catholijn M. Jonker OffRL 27 9 0 26 Jun 2020
Continuous Control for Searching and Planning with a Learned Model Xuxi Yang Werner Duvaud Peng Wei 16 5 0 12 Jun 2020
Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework Ngoc Duy Nguyen Thanh Thi Nguyen Hai V. Nguyen Doug Creighton S. Nahavandi 27 3 0 27 Feb 2020
Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm Dimitri Bertsekas 22 11 0 18 Feb 2020
Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems Sushmita Bhattacharya Sahil Badyal Thomas Wheeler Stephanie Gil Dimitri Bertsekas 27 33 0 11 Feb 2020
The Choice Function Framework for Online Policy Improvement Murugeswari Issakkimuthu Alan Fern Prasad Tadepalli OffRL 17 1 0 01 Oct 2019
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees Thomas W. Anthony Robert Nishihara Philipp Moritz Tim Salimans John Schulman 17 30 0 07 Apr 2019
Learn a Prior for RHEA for Better Online Planning Xinyao Tong W. Liu Bin Li OffRL 26 0 0 14 Feb 2019
Learning 6-DoF Grasping and Pick-Place Using Attention Focus Marcus Gualtieri Robert W. Platt 14 56 0 15 Jun 2018
Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning Yonathan Efroni Gal Dalal B. Scherrer Shie Mannor OffRL 17 14 0 21 May 2018
Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations Dimitri Bertsekas OffRL 33 131 0 12 Apr 2018
Beyond the One Step Greedy Approach in Reinforcement Learning Yonathan Efroni Gal Dalal B. Scherrer Shie Mannor OffRL 50 48 0 10 Feb 2018
Learning the Reward Function for a Misspecified Model Erik Talvitie 22 10 0 29 Jan 2018
A Survey on Compiler Autotuning using Machine Learning Amir H. Ashouri W. Killian John Cavazos G. Palermo Cristina Silvano 35 199 0 13 Jan 2018
Imagination-Augmented Agents for Deep Reinforcement Learning T. Weber S. Racanière David P. Reichert Lars Buesing A. Guez ... Razvan Pascanu Peter W. Battaglia Demis Hassabis David Silver Daan Wierstra LM&Ro 51 549 0 19 Jul 2017
Multi-Labelled Value Networks for Computer Go Ti-Rong Wu I-Chen Wu Guan-Wun Chen Ting Han Wei Tung-Yi Lai Hung-Chun Wu Li-Cheng Lan 36 22 0 30 May 2017
Self-Correcting Models for Model-Based Reinforcement Learning Erik Talvitie LRM 29 92 0 19 Dec 2016
Approximate Policy Iteration for Budgeted Semantic Video Segmentation Behrooz Mahasseni S. Todorovic Alan Fern 22 4 0 26 Jul 2016
Using Monte Carlo Search With Data Aggregation to Improve Robot Soccer Policies Francesco Riccio Roberto Capobianco Daniele Nardi 29 4 0 01 Jun 2016
Classification-based Approximate Policy Iteration: Experiments and Extended Discussions Amir-massoud Farahmand Doina Precup André Barreto Mohammad Ghavamzadeh OffRL 50 7 0 02 Jul 2014
Analysis of Watson's Strategies for Playing Jeopardy! Gerald Tesauro David Gondek J. Lenchner James Fan J. Prager 47 34 0 04 Feb 2014
Learning to Win by Reading Manuals in a Monte-Carlo Framework S. Branavan David Silver Regina Barzilay 49 190 0 18 Jan 2014
Monte Carlo Search Algorithm Discovery for One Player Games Francis Maes D. St-Pierre D. Ernst 65 3 0 23 Aug 2012