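The actor-critic architecture is a reinforcement learning framework that combines a policy network (the actor) with a value network (the critic). The actor selects actions given the current state, while the critic evaluates them by estimating a value function, typically the state-value function V(s); some variants, such as SAC, learn an action-value function Q(s, a) instead. Because the critic's estimate serves as a baseline or bootstrapped target for the policy update, the policy-gradient estimate has lower variance than in pure Monte Carlo policy-gradient methods such as REINFORCE, at the cost of some bias from the critic's approximation error. Unlike purely value-based methods, the actor represents the policy explicitly, which extends naturally to continuous action spaces. The architecture is foundational to algorithms like A3C, PPO, and SAC.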
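
The division of labor between actor and critic can be illustrated with a minimal sketch of a one-step update. This is an illustrative tabular version, not the method of any particular algorithm named above: it assumes a small discrete state and action space, a softmax policy over per-state action preferences, and the one-step TD error as the critic's signal to the actor. The names `actor_critic_step`, `theta`, `v`, and the learning-rate parameters are hypothetical and chosen only for this example.

```python
import math


def softmax(prefs):
    """Convert a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, v, s, a, r, s_next, done,
                      gamma=0.99, actor_lr=0.1, critic_lr=0.1):
    """Apply one tabular actor-critic update in place and return the TD error.

    theta[s][b] is the actor's preference for action b in state s
    (hypothetical tabular parameterization); v[s] is the critic's
    estimate of the state value.
    """
    # Critic: the one-step TD error measures how much better or worse
    # the transition was than the critic expected.
    target = r if done else r + gamma * v[s_next]
    td_error = target - v[s]
    v[s] += critic_lr * td_error

    # Actor: move along grad log pi(a|s), scaled by the TD error.
    # For a softmax policy, d/d theta[s][b] log pi(a|s) = 1[b == a] - pi(b|s).
    probs = softmax(theta[s])
    for b in range(len(theta[s])):
        grad_log = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += actor_lr * td_error * grad_log

    return td_error


# Example: two states, two actions, everything initialized to zero.
theta = [[0.0, 0.0], [0.0, 0.0]]
v = [0.0, 0.0]
delta = actor_critic_step(theta, v, s=0, a=1, r=1.0, s_next=1, done=False)
print(delta, v, theta)
```

In this sketch a positive TD error raises both the critic's value for the visited state and the actor's preference for the action taken, while a negative one lowers them. Practical implementations typically replace the tables with neural networks and the hand-written gradient with automatic differentiation.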




