Model-based reinforcement learning (MBRL) is a paradigm in which an agent learns an explicit internal model of its environment's dynamics, typically a transition function (predicting the next state) and a reward function, and uses this learned model for planning and policy improvement rather than relying solely on trial-and-error experience. This contrasts with model-free reinforcement learning, which learns a value function or policy directly from environmental interactions. The learned model acts as a simulator: the agent can evaluate candidate action sequences by predicting their outcomes, without costly real-world execution. This predictive capability is the core component of systems built around world models.
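The loop described above can be sketched in miniature. The example below is an illustrative toy, not a reference implementation: it assumes a hypothetical 5-state chain environment with a goal state on the right, fits a tabular transition model `T` and reward model `R` from random interaction, then plans inside the learned model by exhaustively scoring short action sequences (all names here are invented for the sketch).

```python
import random
from itertools import product

# Toy chain MDP: states 0..4, actions 0 (left) / 1 (right).
# Entering the goal state (4) yields reward 1; everything else yields 0.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    """The real environment: the agent can query it but does not see its rules."""
    ns = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return ns, (1.0 if ns == GOAL else 0.0)

# 1) Learn an explicit model from random interaction (tabular here;
#    in practice this would be a neural network fit by regression).
T = {}  # (state, action) -> predicted next state
R = {}  # (state, action) -> predicted reward
random.seed(0)
for _ in range(500):
    s = random.randrange(N_STATES)
    a = random.randrange(N_ACTIONS)
    ns, r = step(s, a)
    T[(s, a)] = ns
    R[(s, a)] = r

# 2) Plan inside the learned model: score every action sequence of a
#    fixed horizon using only T and R (no calls to the real environment),
#    and return the first action of the best sequence.
def plan(s0, horizon=4):
    best_a, best_ret = 0, float("-inf")
    for seq in product(range(N_ACTIONS), repeat=horizon):
        s, ret = s0, 0.0
        for a in seq:
            ret += R[(s, a)]   # predicted reward, not observed reward
            s = T[(s, a)]      # predicted next state
        if ret > best_ret:
            best_a, best_ret = seq[0], ret
    return best_a

print(plan(0))  # from the far-left state, the planner picks "right"
```

Exhaustive search over sequences is feasible only in tiny action spaces; practical MBRL planners substitute sampling-based methods (e.g., random shooting or the cross-entropy method) over the same learned `T` and `R`.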
