Dreamer is a model-based reinforcement learning algorithm that learns a compact Recurrent State-Space Model (RSSM) of environment dynamics and trains its policy and value function entirely via latent imagination, i.e., backpropagation through time on rollouts generated by the model. Because policy learning is decoupled from costly real-world interaction, the agent achieves high sample efficiency: it imagines future trajectories in its latent state space to evaluate and improve its decision-making strategy, never querying the real environment during optimization.
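To make the idea of latent imagination concrete, the sketch below rolls a policy forward entirely inside a toy latent dynamics model and accumulates discounted predicted rewards. Everything here is an illustrative stand-in, not Dreamer's actual architecture: the linear-`tanh` dynamics, reward head, policy, latent dimension, and horizon are all assumptions chosen for brevity. The key property it demonstrates is that trajectory evaluation touches only the learned model, never a real environment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not Dreamer's real hyperparameters).
LATENT_DIM, ACTION_DIM, HORIZON = 4, 2, 15

# Randomly initialized stand-ins for learned weights.
W_dyn = rng.normal(scale=0.3, size=(LATENT_DIM, LATENT_DIM + ACTION_DIM))
w_rew = rng.normal(size=LATENT_DIM)
W_pol = rng.normal(scale=0.3, size=(ACTION_DIM, LATENT_DIM))

def dynamics(z, a):
    """Toy deterministic latent transition: next state from (state, action)."""
    return np.tanh(W_dyn @ np.concatenate([z, a]))

def reward(z):
    """Toy learned reward predictor evaluated on a latent state."""
    return float(w_rew @ z)

def policy(z):
    """Toy policy acting directly on the latent state."""
    return np.tanh(W_pol @ z)

def imagine_return(z0, gamma=0.99):
    """Roll the policy forward inside the model for HORIZON steps and
    sum discounted predicted rewards. No environment interaction occurs:
    every transition and reward here is imagined by the model."""
    z, total, discount = z0, 0.0, 1.0
    for _ in range(HORIZON):
        a = policy(z)          # act from the current latent state
        z = dynamics(z, a)     # imagine the next latent state
        total += discount * reward(z)
        discount *= gamma
    return total

# Evaluate an imagined trajectory from a random starting latent state.
z0 = rng.normal(size=LATENT_DIM)
print(imagine_return(z0))
```

In the full algorithm these components are differentiable networks, so gradients of the imagined return flow back through the rollout to the policy parameters; this sketch only shows the forward evaluation pass.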
