Hierarchical Reinforcement Learning (HRL) is a machine learning paradigm that structures an agent's decision-making process into multiple levels of temporal abstraction to solve complex, long-horizon tasks. Instead of learning a single, monolithic policy over primitive actions, HRL decomposes the overall problem into a hierarchy of subtasks or skills, often called options or skills. Higher-level policies select among these temporally extended actions, which then execute their own lower-level policies until termination, enabling the agent to reason and plan over longer time horizons.




