Subgoal Discovery Using a Free Energy Paradigm and State Aggregations

21 December 2024
Amirhossein Mesbah
Reshad Hosseini
Seyed Pooya Shariatpanahi
Majid Nili Ahmadabadi
Abstract

Reinforcement learning (RL) plays a major role in solving complex sequential decision-making tasks. Hierarchical and goal-conditioned RL are promising methods for dealing with two major problems in RL: sample inefficiency and the difficulty of reward shaping. These methods tackle both problems by decomposing a task into simpler subtasks and by temporally abstracting the task in the action space. A key component of task decomposition in these methods is subgoal discovery: subgoal states can be used to define hierarchies of actions and to decompose complex tasks. Under the assumption that subgoal states are more unpredictable than other states, we propose a free energy paradigm to discover them. This is achieved by using free energy to select between two spaces, the main space and an aggregation space. The change in the model from neighboring states to a given state indicates how unpredictable that state is, and it is therefore used in this paper for subgoal discovery. Our empirical results on navigation tasks such as grid-world environments show that the proposed method can be applied to subgoal discovery without prior knowledge of the task. The proposed method is also robust to stochasticity in the environment.
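
The "model change" intuition can be illustrated with a toy sketch. This is not the paper's algorithm: it omits the free energy selection between the main and aggregation spaces entirely, assumes a deterministic environment, and uses a hypothetical two-room grid layout. All names (`make_two_rooms`, `local_model`, `model_change`) are invented for illustration. Each free cell gets a simple local model (which of the four moves succeed), and a cell's score is the mean L1 distance between its model and the models of its free neighbors.

```python
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def make_two_rooms(height=5, width=11, door_row=2):
    """Hypothetical layout: two rooms split by a wall column with a single door.

    Returns the set of free (row, col) cells.
    """
    wall_col = width // 2
    return {
        (r, c)
        for r in range(height)
        for c in range(width)
        if c != wall_col or r == door_row
    }


def local_model(free, cell):
    """Toy local model: for each action, 1.0 if the move succeeds, else 0.0."""
    r, c = cell
    return tuple(1.0 if (r + dr, c + dc) in free else 0.0 for dr, dc in ACTIONS)


def model_change(free):
    """Mean L1 distance between each cell's local model and its free neighbors' models."""
    scores = {}
    for cell in free:
        own = local_model(free, cell)
        r, c = cell
        diffs = [
            sum(abs(a - b) for a, b in zip(own, local_model(free, (r + dr, c + dc))))
            for dr, dc in ACTIONS
            if (r + dr, c + dc) in free
        ]
        scores[cell] = sum(diffs) / len(diffs) if diffs else 0.0
    return scores


free = make_two_rooms()
scores = model_change(free)
candidate = max(scores, key=scores.get)  # cell whose model differs most from its neighbors
print(candidate, scores[candidate])  # (2, 5) 2.0
```

In this particular layout the doorway cell receives the highest score, while cells in the open interior of a room score zero. Cells next to walls and in corners also receive nonzero scores, so a raw score like this one would need a threshold or a more careful criterion to separate subgoals from ordinary boundary cells.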

@article{mesbah2025_2412.16687,
  title={Subgoal Discovery Using a Free Energy Paradigm and State Aggregations},
  author={Amirhossein Mesbah and Reshad Hosseini and Seyed Pooya Shariatpanahi and Majid Nili Ahmadabadi},
  journal={arXiv preprint arXiv:2412.16687},
  year={2025}
}