Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act

16 March 2022

Olivier Pietquin

Papers citing "Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act"

GPO: Learning from Critical Steps to Improve LLM Reasoning; Jiahao Yu, Zelei Cheng, Xian Wu, Xinyu Xing; LRM; 115 1 0; 19 Sep 2025
Learning to Make Adherence-Aware Advice; International Conference on Learning Representations (ICLR), 2023; Guanting Chen, Xiaocheng Li, Chunlin Sun, Hanzhao Wang; 112 15 0; 01 Oct 2023
Temporally Layered Architecture for Adaptive, Distributed and Continuous Control; Adaptive Agents and Multi-Agent Systems (AAMAS), 2022; Devdhar Patel, Joshua Russell, Frances Walsh, T. Rahman, Terrance Sejnowski, H. Siegelmann; AI4CE; 254 1 0; 25 Dec 2022
The Best Decisions Are Not the Best Advice: Making Adherence-Aware Recommendations; Management Sciences (MS), 2022; Julien Grand-Clément, Jean Pauphilet; OffRL; 370 16 0; 05 Sep 2022
Learning to Switch Among Agents in a Team via 2-Layer Markov Decision Processes; Vahid Balazadeh Meresht, Abir De, Adish Singla, Manuel Gomez-Rodriguez; 173 4 0; 11 Feb 2020