An agent policy is the core decision-making function—implemented as a set of rules, a learned model, or a search algorithm—that maps an agent's perceived state and internal beliefs to its chosen actions, thereby governing its autonomous behavior within an environment. In reinforcement learning, it is often a neural network trained to maximize cumulative reward, while in symbolic agent-oriented programming, it may be a collection of condition-action rules or a BDI (Belief-Desire-Intention) reasoning cycle. The policy is the executable embodiment of the agent's strategy for achieving its goals.
