A Value Equivalent Model is a learned internal model within a model-based reinforcement learning (MBRL) agent that needs to be accurate only for computing values and policies, rather than matching the true environment's state transitions exactly. This concept, central to algorithms such as DeepMind's MuZero, shifts the modeling objective from perfect system identification to learning a representation sufficient for high-quality planning and decision-making.
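To make the shift in objective concrete, the following is a minimal toy sketch (not MuZero itself) of a value-equivalent training loss. All names, dimensions, and the linear "networks" standing in for the representation, dynamics, and prediction functions are illustrative assumptions; the point is that the loss scores only reward and value predictions along a latent rollout, with no term that compares latent states to true environment states or reconstructs observations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions (illustrative only).
OBS_DIM, LATENT_DIM = 4, 3

# Randomly initialised linear maps standing in for learned networks:
W_repr = rng.normal(size=(LATENT_DIM, OBS_DIM))    # representation: obs -> latent
W_dyn = rng.normal(size=(LATENT_DIM, LATENT_DIM))  # dynamics: latent -> next latent
w_reward = rng.normal(size=LATENT_DIM)             # reward head: latent -> reward
w_value = rng.normal(size=LATENT_DIM)              # value head: latent -> value

def value_equivalent_loss(obs, rewards, value_target):
    """Squared error on predicted rewards and the final value along a
    latent rollout. Note what is absent: no observation reconstruction
    and no comparison of latents to the environment's true states."""
    s = W_repr @ obs                      # encode the root observation
    loss = 0.0
    for r in rewards:                     # unroll the learned dynamics
        loss += (w_reward @ s - r) ** 2   # match observed rewards...
        s = W_dyn @ s
    loss += (w_value @ s - value_target) ** 2  # ...and the value target
    return loss

obs = rng.normal(size=OBS_DIM)
loss = value_equivalent_loss(obs, rewards=[1.0, 0.0], value_target=2.0)
print(loss)
```

Minimizing a loss of this shape pressures the latent dynamics to be correct only where it matters for planning, which is what distinguishes the value-equivalence objective from fitting the full transition function.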
