A span is the fundamental unit of work in distributed tracing, representing a single named and timed operation within a larger request trace, such as a function call, database query, or HTTP request. Each span contains a unique identifier, a parent span ID to establish causality, and key metadata including a start/end timestamp, operation name, status code, and custom attributes or tags. Spans are the atomic records that, when linked together via their parent-child relationships, form a complete trace visualizing the entire path of a transaction.

What is a Span?
A span is the fundamental building block of distributed tracing, providing a detailed record of a single operation within a larger request.
In the context of agentic observability, spans are critical for instrumenting autonomous agents to monitor their internal reasoning loops, tool calls, and external API interactions. By emitting a span for each discrete step (such as planning, retrieval, or execution), engineers can reconstruct the agent's decision path, measure the latency of each step, and pinpoint failures. Spans are typically created and exported with OpenTelemetry (OTel), and their context is propagated between services using the W3C Trace Context standard, which enables interoperability across heterogeneous services within a multi-agent system.
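As a rough illustration of emitting one span per agent step, the sketch below records span-like dictionaries with a context manager. It uses only the Python standard library and is not the OpenTelemetry API; `agent_span`, `FINISHED_SPANS`, and the step names are hypothetical names chosen for this example.

```python
import secrets
import time
from contextlib import contextmanager

# Hypothetical in-memory collector; a real system would export spans instead.
FINISHED_SPANS = []

@contextmanager
def agent_span(name, trace_id, parent_id=None):
    """Record one agent step as a span-like dict (illustrative only)."""
    span = {
        "name": name,
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),   # 8 random bytes as 16 hex chars
        "parent_id": parent_id,
        "start_ns": time.time_ns(),
        "status": "UNSET",
    }
    try:
        yield span
    except Exception:
        span["status"] = "ERROR"           # mark the step as failed, then re-raise
        raise
    finally:
        span["end_ns"] = time.time_ns()
        FINISHED_SPANS.append(span)        # spans are recorded when they finish

trace_id = secrets.token_hex(16)           # 16 random bytes as 32 hex chars
with agent_span("agent.run", trace_id) as root:
    for step in ("planning", "retrieval", "execution"):
        with agent_span(step, trace_id, parent_id=root["span_id"]):
            pass                           # the real step would run here

for s in FINISHED_SPANS:
    print(s["name"], s["end_ns"] - s["start_ns"], "ns")
```

Because each span is appended when it ends, the child steps appear in the collector before the root span that encloses them.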
Key Components of a Span
A span is a structured data object representing a single operation. Its anatomy is defined by the OpenTelemetry specification, providing a standardized format for capturing work in a distributed system.
Span Name & Kind
The span name is a human-readable identifier for the operation (e.g., HTTP GET /api/users). The span kind categorizes the role of the span within a trace:
- SERVER: For the receiver of a remote request.
- CLIENT: For the initiator of a remote request.
- INTERNAL: For operations within an application boundary.
- PRODUCER & CONSUMER: For messaging systems.
The name and kind are essential for semantic grouping and understanding the flow of a request.
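One way to picture name and kind together is a small record type with an enumeration of the five kinds. This is a standard-library sketch rather than the OpenTelemetry SDK; `SpanHeader` and the sample span names other than HTTP GET /api/users are assumptions made for illustration. The default of INTERNAL mirrors the default kind in the OpenTelemetry specification.

```python
from dataclasses import dataclass
from enum import Enum

class SpanKind(Enum):
    INTERNAL = "internal"
    SERVER = "server"
    CLIENT = "client"
    PRODUCER = "producer"
    CONSUMER = "consumer"

@dataclass(frozen=True)
class SpanHeader:
    """Hypothetical container for the two descriptive fields of a span."""
    name: str
    kind: SpanKind = SpanKind.INTERNAL   # kind defaults to INTERNAL when unspecified

spans = [
    SpanHeader("HTTP GET /api/users", SpanKind.SERVER),
    SpanHeader("SELECT users", SpanKind.CLIENT),
    SpanHeader("render_response"),
]

# Group span names by kind, as a backend might for semantic grouping.
by_kind = {}
for s in spans:
    by_kind.setdefault(s.kind, []).append(s.name)

for kind, names in by_kind.items():
    print(kind.name, names)
```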
Trace & Span Context
The trace context is the causal metadata that links spans together. It contains two critical identifiers:
- Trace ID: A globally unique 16-byte identifier shared by all spans in a single distributed trace.
- Span ID: An 8-byte identifier unique within its trace for the specific span.
This context is propagated (e.g., via HTTP headers) across service boundaries, enabling the reconstruction of the full request path. A parent span ID explicitly defines the causal relationship within the trace tree.
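The identifier sizes and header propagation described above can be sketched as follows. The header layout follows the W3C Trace Context `traceparent` format (version, trace ID, parent span ID, flags, all lowercase hex); the helper names are hypothetical, and a real service would normally rely on a tracing library's propagator rather than formatting the header by hand.

```python
import secrets

def new_trace_id() -> str:
    """16 random bytes rendered as 32 lowercase hex characters."""
    return secrets.token_hex(16)

def new_span_id() -> str:
    """8 random bytes rendered as 16 lowercase hex characters."""
    return secrets.token_hex(8)

def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a version-00 traceparent value: version-traceid-parentid-flags."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

trace_id = new_trace_id()
current_span_id = new_span_id()

# Outgoing request headers carry the caller's context to the next service.
headers = {"traceparent": make_traceparent(trace_id, current_span_id)}
print(headers["traceparent"])
```

The standard treats all-zero IDs as invalid; random generation makes that case vanishingly unlikely, so this sketch does not check for it.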
Timestamps & Duration
Spans are fundamentally temporal units. They record precise timestamps:
- Start Timestamp: The nanosecond-precision UTC time when the operation began.
- End Timestamp: The time when the operation completed.
The duration is calculated as (End Timestamp - Start Timestamp). This allows for precise latency analysis of individual operations and aggregate performance of entire traces. High-resolution timestamps are critical for identifying performance regressions.
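A small worked example of the duration arithmetic, using made-up nanosecond timestamps. The span names and values are illustrative assumptions, and the trace-level figure here is simply the earliest start to the latest end.

```python
# Two hypothetical spans with nanosecond timestamps (values are made up).
spans = [
    {"name": "handle_request", "start_ns": 1_000_000_000, "end_ns": 1_250_000_000},
    {"name": "db_query", "start_ns": 1_050_000_000, "end_ns": 1_092_500_000},
]

def duration_ms(span):
    """Duration = end timestamp - start timestamp, converted from ns to ms."""
    return (span["end_ns"] - span["start_ns"]) / 1_000_000

per_span = {s["name"]: duration_ms(s) for s in spans}

# Wall-clock extent of the whole trace: earliest start to latest end.
trace_ms = (max(s["end_ns"] for s in spans) - min(s["start_ns"] for s in spans)) / 1_000_000

print(per_span)   # {'handle_request': 250.0, 'db_query': 42.5}
print(trace_ms)   # 250.0
```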
Status & Events
The span status indicates the final outcome of the operation:
- Unset: The default. The status has not been explicitly set, which is generally interpreted as no error.
- Ok: Explicitly set by the developer to mark the operation as successful.
- Error: Indicates the operation failed. Must be set for errors.
Span events (or annotations) are timestamped logs attached to a span, representing significant moments during its execution (e.g., exception thrown, cache.miss, message.sent). They provide a detailed, time-ordered narrative of the span's internal lifecycle.
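To show how status and events interact, the sketch below wraps an operation, leaves the status unset on success, and on failure records an exception event before setting the status to error. It is a standard-library illustration with hypothetical names (`run_traced`, the dict layout). The event name `exception` and the `exception.type` / `exception.message` attribute keys follow OpenTelemetry's semantic conventions for exceptions.

```python
import time

def run_traced(name, operation):
    """Run `operation` and return (span, result); the span is a plain dict."""
    span = {"name": name, "status": "UNSET", "events": []}
    try:
        result = operation()
    except Exception as exc:
        # Record what happened and when, then mark the span as failed.
        span["events"].append({
            "name": "exception",
            "time_ns": time.time_ns(),
            "attributes": {
                "exception.type": type(exc).__name__,
                "exception.message": str(exc),
            },
        })
        span["status"] = "ERROR"
        return span, None
    return span, result   # status stays UNSET for an ordinary success

ok_span, value = run_traced("cache.lookup", lambda: 42)
bad_span, _ = run_traced("cache.lookup", lambda: 1 / 0)

print(ok_span["status"], value)
print(bad_span["status"], bad_span["events"][0]["attributes"]["exception.type"])
```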
Attributes
Span attributes are key-value pairs that describe the context of the operation. They are the primary mechanism for adding dimensional metadata for filtering and aggregation. Examples include:
- http.method: "GET"
- db.system: "postgresql"
- net.peer.ip: "10.0.0.1"
- custom.business.id: "order_12345"
Attributes should follow semantic conventions for consistency but can be extended with custom, business-specific data. They are indexed in observability backends for powerful querying.
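The filtering role of attributes can be illustrated with a tiny in-memory query. This is only a sketch of the idea: real backends index attributes rather than scanning, and `find_spans` and the sample span names are hypothetical.

```python
# Hypothetical spans carrying the attribute keys used as examples above.
spans = [
    {"name": "GET /orders",
     "attributes": {"http.method": "GET", "custom.business.id": "order_12345"}},
    {"name": "SELECT orders",
     "attributes": {"db.system": "postgresql", "custom.business.id": "order_12345"}},
    {"name": "GET /health",
     "attributes": {"http.method": "GET"}},
]

def find_spans(spans, required):
    """Return spans whose attributes contain every key-value pair in `required`."""
    return [
        s for s in spans
        if all(s["attributes"].get(k) == v for k, v in required.items())
    ]

order_spans = find_spans(spans, {"custom.business.id": "order_12345"})
get_spans = find_spans(spans, {"http.method": "GET"})

print([s["name"] for s in order_spans])   # ['GET /orders', 'SELECT orders']
print([s["name"] for s in get_spans])     # ['GET /orders', 'GET /health']
```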
Links
A span link creates a causal relationship between a span and another span, which may belong to a different trace. This is crucial for modeling batch or asynchronous processing where a single operation is triggered by multiple initiating requests. For example, a background job that processes messages from a queue can create a span linked to the spans from each producer that enqueued those messages. Each link contains the Trace ID and Span ID of the linked context. Links differ from parent-child relationships, which exist within a single trace.
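The queue example can be sketched with plain data classes: each producer has its own trace, and the batch span starts a new trace while linking back to every producer context. The class and function names are assumptions made for this illustration and do not correspond to a specific library.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(frozen=True)
class SpanContext:
    trace_id: str   # 32 hex chars
    span_id: str    # 16 hex chars

@dataclass
class Span:
    name: str
    context: SpanContext
    parent_span_id: Optional[str] = None          # parent lives in the same trace
    links: List[SpanContext] = field(default_factory=list)  # may point to other traces

def new_context() -> SpanContext:
    """Start a fresh trace with a fresh span ID."""
    return SpanContext(secrets.token_hex(16), secrets.token_hex(8))

# Three producers, each enqueuing a message inside its own trace.
producer_contexts = [new_context() for _ in range(3)]

# The batch job has no single parent; it links to every producer instead.
batch_span = Span("process_batch", new_context(), links=list(producer_contexts))

for link in batch_span.links:
    print("linked to trace", link.trace_id, "span", link.span_id)
```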
Frequently Asked Questions
A span is the fundamental building block of distributed tracing, representing a single, timed operation within a request's lifecycle. These FAQs address its core mechanics, role in observability, and implementation details for engineering teams.
A span is the fundamental unit of work in distributed tracing, representing a single named and timed operation within a larger request trace. It captures the execution of a discrete piece of logic, such as a function call, a database query, an HTTP request to an external service, or a computational step within an autonomous agent's reasoning loop.
Each span contains critical metadata:
- Operation Name: A descriptive label (e.g., validate_user_input, call_llm_api).
- Start and End Timestamps: Precisely define the operation's duration.
- Span Context: Contains the essential identifiers: a Trace ID to link all spans in a request and a unique Span ID.
- Attributes/Key-Value Pairs: Structured metadata providing context (e.g., { "http.method": "POST", "agent.step": "planning" }).
- Status: Typically a code (OK, ERROR) and optional description.
- Events: Timed, structured log messages within the span's lifetime.
- Links: References to causally related spans in other traces.
Spans are nested to represent parent-child relationships, forming a trace tree that visualizes the complete flow of a transaction across services and agents.
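The way parent-child relationships turn a flat list of spans into a trace tree can be sketched as below. The span IDs are shortened for readability, and apart from validate_user_input and call_llm_api the names are hypothetical.

```python
from collections import defaultdict

# Flat span records as a backend might receive them (IDs shortened for readability).
spans = [
    {"span_id": "a1", "parent_id": None, "name": "handle_request"},
    {"span_id": "b2", "parent_id": "a1", "name": "validate_user_input"},
    {"span_id": "c3", "parent_id": "a1", "name": "call_llm_api"},
    {"span_id": "d4", "parent_id": "c3", "name": "http_post"},
]

def render_tree(spans):
    """Return indented lines showing each span beneath its parent."""
    children = defaultdict(list)
    for s in spans:
        children[s["parent_id"]].append(s)

    lines = []

    def walk(parent_id, depth):
        for s in children[parent_id]:
            lines.append("  " * depth + s["name"])
            walk(s["span_id"], depth + 1)

    walk(None, 0)   # root spans have no parent
    return lines

tree = render_tree(spans)
print("\n".join(tree))
```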
Related Terms
A span is the fundamental building block of a distributed trace. To fully understand its role and implementation, it's essential to know the related concepts and technologies that define, collect, and process this data.
Distributed Tracing
Distributed tracing is the overarching methodology for observing and profiling requests as they flow through a distributed system. It tracks the full path, latency, and relationships between operations across multiple services and components.
- A trace is the complete end-to-end record of a single request.
- Spans are the individual, timed operations that compose a trace.
- This technique is critical for diagnosing latency issues and understanding service dependencies in microservices and agentic architectures.
OpenTelemetry (OTel)
OpenTelemetry is a vendor-neutral, open-source observability framework that provides the standardized tools to generate, collect, and export telemetry data, including spans. It is the de facto standard for instrumenting modern applications.
- Provides unified APIs and SDKs for creating spans and traces.
- Defines the OpenTelemetry Protocol (OTLP) for data transmission.
- Includes the OTel Collector for receiving, processing, and routing telemetry data.
Trace Context
Trace context is the metadata that propagates the necessary identifiers to correlate spans from different services into a single, coherent distributed trace. It is essential for maintaining the integrity of a request's journey.
- Contains the Trace ID (unique to the overall request) and Span ID (unique to the current operation).
- Often propagated via HTTP headers (following the W3C Trace Context standard) or RPC metadata.
- Ensures that all spans generated by downstream services are linked to the correct parent trace.
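On the receiving side, a downstream service reads the incoming context and starts its own span under it. The sketch below parses a W3C `traceparent` header value and creates a child span record. It accepts only the four-field layout used by version 00, and the function names and dict shapes are hypothetical.

```python
import re
import secrets

# version - trace id (32 hex) - parent span id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def parse_traceparent(header):
    """Return the propagated context as a dict, or None if the header is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero identifiers are invalid under the standard.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }

def start_child_span(name, incoming_header):
    """Continue the caller's trace if possible; otherwise start a new one."""
    ctx = parse_traceparent(incoming_header or "")
    return {
        "name": name,
        "trace_id": ctx["trace_id"] if ctx else secrets.token_hex(16),
        "parent_id": ctx["parent_id"] if ctx else None,
        "span_id": secrets.token_hex(8),
    }

header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
child = start_child_span("handle_request", header)
print(child["trace_id"], child["parent_id"])
```

Falling back to a new trace when the header is missing or malformed keeps the service traceable even when an upstream caller is not instrumented.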
Span Attributes
Span attributes (or tags) are key-value pairs that provide descriptive, queryable metadata about a specific operation. They turn raw timing data into actionable, contextualized information.
- Examples: http.method="GET", db.system="postgresql", agent.decision="retry", error=true.
- Used for filtering, grouping, and analyzing trace data in observability backends.
- Critical for adding business context (e.g., user.id, transaction.amount) to operational telemetry.
Span Events
A span event (or annotation) is a structured log message attached to a specific point in time within a span's lifetime. It records notable occurrences during the operation's execution.
- Examples: Recording an exception stack trace, a checkpoint in a long-running process, or a state change within an agent's reasoning loop.
- Contains a timestamp and a set of attributes describing the event.
- Provides a finer-grained, time-series view of what happened during the span, not just at its start and end.
Sampling Strategy
A sampling strategy is a rule-based approach for selectively reducing the volume of trace data collected, balancing observability detail against system overhead and cost.
- Head-based sampling: The decision to sample is made at the start of the trace (e.g., sample 10% of all requests). Simple but can miss important, rare traces.
- Tail-based sampling: The decision is made after the trace completes, based on its properties (e.g., keep all traces with errors or latency > 1s). More intelligent but requires buffering.
- Essential for managing data volume in high-throughput agentic systems.
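The two strategies can be contrasted in a few lines. This is a simplified sketch under stated assumptions: the head-based rule derives a deterministic decision from the trace ID so that every service reaches the same verdict, and the tail-based rule keeps traces with an error or a duration over one second. The function names and the exact threshold rule are illustrative, not the algorithm of any particular SDK or collector.

```python
def head_sample(trace_id: str, ratio: float) -> bool:
    """Decide at trace start: keep roughly `ratio` of traces, based on the ID alone."""
    # Interpret the low 8 bytes of the trace ID as an integer in [0, 2**64).
    return int(trace_id[-16:], 16) < int(ratio * 2**64)

def tail_sample(spans, latency_threshold_ns: int = 1_000_000_000) -> bool:
    """Decide after the trace completes: keep it if it failed or was slow."""
    has_error = any(s["status"] == "ERROR" for s in spans)
    start = min(s["start_ns"] for s in spans)
    end = max(s["end_ns"] for s in spans)
    return has_error or (end - start) > latency_threshold_ns

# Made-up completed traces buffered for the tail-based decision.
fast_ok = [{"status": "UNSET", "start_ns": 0, "end_ns": 20_000_000}]
fast_error = [{"status": "ERROR", "start_ns": 0, "end_ns": 20_000_000}]
slow_ok = [{"status": "UNSET", "start_ns": 0, "end_ns": 1_500_000_000}]

print(head_sample("0" * 32, 0.10), head_sample("f" * 32, 0.10))          # True False
print(tail_sample(fast_ok), tail_sample(fast_error), tail_sample(slow_ok))  # False True True
```

The head-based function needs nothing but the trace ID, which is why it is cheap, while the tail-based function needs the finished spans, which is where the buffering cost comes from.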

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.