Guiding Hierarchical Reinforcement Learning in Partially Observable Environments with AI Planning
Authors: Brandon Rozek, Junkyu Lee, Harsha Kokel, Michael Katz, Shirin Sohrabi
Conference: International Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL)
Publication Date: 2024/06/02
Abstract: Partially observable Markov decision processes challenge reinforcement learning agents because observations provide only a limited view of the environment. An agent must therefore explore, collecting observations to build up the state information needed to complete the task. Even assuming knowledge is monotonic, it is difficult to know when to stop exploring. We integrate AI planning within hierarchical reinforcement learning to aid exploration in partially observable environments. Given a set of unknown state variables, their potential valuations, and the abstract operators that may discover them, we construct an abstract fully-observable non-deterministic (FOND) planning problem that captures the agent's abstract belief state. This decomposes the POMDP into a tree of semi-POMDPs based on sensing outcomes. We evaluate our agent's performance on a MiniGrid domain and show how guided exploration can improve agent performance.
PDF Link: https://prl-theworkshop.github.io/prl2024-icaps/papers/12.pdf
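To make the construction in the abstract concrete, below is a minimal, hypothetical Python sketch of the abstract belief-state space it describes: each unknown state variable is either still unknown or bound to one of its valuations, and applying a sensing operator branches nondeterministically over that variable's possible valuations, one branch per sensing outcome, matching the tree of semi-POMDPs. All names here (`SenseOp`, `build_abstract_fond`, the toy MiniGrid-style variables) are illustrative assumptions, not the authors' code or the paper's formalism.

```python
from dataclasses import dataclass
from itertools import product

UNKNOWN = "?"

@dataclass(frozen=True)
class SenseOp:
    """A hypothetical abstract operator that may discover one unknown variable."""
    name: str
    discovers: str

def build_abstract_fond(valuations: dict[str, list[str]], ops: list[SenseOp]):
    """Enumerate abstract belief states (each variable either UNKNOWN or bound
    to one valuation) and the nondeterministic transition relation: applying a
    sensing operator to a state whose variable is still unknown yields one
    successor per possible valuation, i.e., one branch per sensing outcome."""
    variables = sorted(valuations)
    domains = [[UNKNOWN] + valuations[v] for v in variables]
    states = [tuple(zip(variables, vals)) for vals in product(*domains)]

    transitions = {}  # (belief state, op name) -> set of successor belief states
    for state in states:
        belief = dict(state)
        for op in ops:
            if belief[op.discovers] != UNKNOWN:
                continue  # already sensed; knowledge is assumed monotonic
            successors = set()
            for value in valuations[op.discovers]:
                succ = dict(belief)
                succ[op.discovers] = value
                successors.add(tuple(sorted(succ.items())))
            transitions[(state, op.name)] = successors

    # Goal: every unknown variable has been discovered.
    goals = [s for s in states if all(v != UNKNOWN for _, v in s)]
    return states, transitions, goals

# Toy MiniGrid-style instance: the key's location and the door's colour are
# unknown until the corresponding abstract operator senses them.
states, transitions, goals = build_abstract_fond(
    valuations={"key_loc": ["room1", "room2"], "door_colour": ["red", "blue"]},
    ops=[SenseOp("search_rooms", "key_loc"), SenseOp("inspect_door", "door_colour")],
)
print(len(states), "abstract belief states;", len(goals), "fully-known goal states")
```

In this reading, a FOND planner over these belief states would prescribe which sensing operator to attempt next, while the lower-level RL policies handle execution within each resulting semi-POMDP; see the paper for the actual formulation.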