The Golden Signals are four key metrics—latency, traffic, errors, and saturation—that provide a comprehensive, high-level view of a distributed system's health and performance. Originally popularized by Google's Site Reliability Engineering (SRE) practices, they answer the essential questions of how fast a system is, how much demand it's handling, the rate of failures, and how full its resources are. In the context of multi-agent system orchestration, these signals are critical for observing the collective behavior of interacting autonomous agents, moving beyond individual agent metrics to understand the health of the entire workflow.
