Span Links

Span links are references from one span to another causally related span, usually in a different trace. They represent relationships such as batch processing or asynchronous triggers that a parent-child hierarchy cannot express.

DISTRIBUTED TRACE COLLECTION

What Are Span Links?

A mechanism in distributed tracing for connecting causally related spans across different traces.

Span links are explicit references from one span to another span, most often in a separate trace, used to model causal relationships that are not parent-child dependencies. A parent-child relationship always stays within a single trace; a link can cross trace boundaries to represent asynchronous or batch-processed triggers, such as a message being published to a queue and later consumed. (OpenTelemetry also permits links between spans in the same trace, but the cross-trace case is the common one.) Links are a core concept in OpenTelemetry and are essential for accurately modeling event-driven architectures where work is decoupled.

In practice, a span can contain multiple links, each pointing to a span context (containing trace ID and span ID) from another trace. This allows observability backends to reconstruct and visualize workflows that span multiple independent requests. For agentic systems, links are critical for tracing the lifecycle of a task as it triggers subsequent autonomous actions, enabling full end-to-end tracing of asynchronous, multi-step business processes that traditional hierarchical traces cannot capture.
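The structure described above can be sketched with plain Python dataclasses. This is an illustrative model only, not the OpenTelemetry SDK: the names SpanContext, Link, Span, and new_root_span are assumptions chosen for readability.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: str  # 32 hex characters
    span_id: str   # 16 hex characters


@dataclass
class Link:
    context: SpanContext                          # the span being pointed at
    attributes: dict = field(default_factory=dict)


@dataclass
class Span:
    name: str
    context: SpanContext
    parent_span_id: Optional[str] = None          # parent-child edge, same trace
    links: list = field(default_factory=list)     # causal references, usually to other traces


def new_root_span(name, links=None):
    """Start a new trace: fresh trace ID, no parent."""
    ctx = SpanContext(secrets.token_hex(16), secrets.token_hex(8))
    return Span(name, ctx, links=list(links or []))


# Two independent requests each enqueue a work item (two separate traces).
enqueue_a = new_root_span("enqueue order A")
enqueue_b = new_root_span("enqueue order B")

# A worker later handles both items in a third trace and links back to each cause.
worker = new_root_span(
    "process order batch",
    links=[Link(enqueue_a.context), Link(enqueue_b.context)],
)

linked_traces = {link.context.trace_id for link in worker.links}
```

The worker span has no parent, yet its two links are enough for a backend to find the traces that caused it.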

Key Characteristics of Span Links

Span links differ from parent-child relationships in scope, timing semantics, and the context they carry. The characteristics below summarize what a link does and does not imply.

Cross-Trace Relationship

A span link establishes a causal or reference relationship between a span in the current trace and a span that is usually in a different, independent trace. This is distinct from a parent-child relationship, which exists within a single trace.

Primary Use: Modeling asynchronous or batch processes where one operation triggers another without a direct, synchronous call.
Example: A batch job (Trace A) processes 10,000 records, each triggering an API call in its own trace (Traces B1-B10,000). Each API call span carries a link back to the batch job span, because the link is recorded on the span that starts later and already knows the context of its cause.

Zero Impact on Trace Duration

Linking spans does not affect the timing calculations of either trace. The linked-to span's start and end times are independent.

Key Distinction: A parent span's duration usually encloses the durations of its synchronous child spans. A linking span's duration is unaffected by the spans it links to.
Implication for Analysis: You cannot sum durations across linked traces to calculate an 'end-to-end' time, because the gap between the linked spans (for example, time spent waiting in a queue) belongs to neither span. Links indicate causality, not a continuous timing path.
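A small worked example may make the timing point concrete. The sketch below uses a hypothetical TimedSpan class and invented timestamps, and it assumes the clocks of the two services are synchronized well enough to compare.

```python
from dataclasses import dataclass


@dataclass
class TimedSpan:
    name: str
    start: float  # seconds since an arbitrary shared epoch
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


# Trace A: a publisher span that finishes quickly.
publish = TimedSpan("publish message", start=0.0, end=0.05)

# Trace B: the consumer span, started 30 s later, which links back to `publish`.
consume = TimedSpan("consume message", start=30.0, end=30.4)

# Summing durations ignores the time the message spent waiting in the queue.
summed = publish.duration + consume.duration

# Wall-clock elapsed time has to come from timestamps, not from durations.
elapsed = consume.end - publish.start
queue_wait = consume.start - publish.end
```

Here the summed durations come to well under a second, while the elapsed time from publish to the end of processing is about half a minute. Almost all of it is queue wait that neither span records.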

Attribute-Based Context

A link carries the span context of the linked-to span plus optional attributes that describe the link. The context includes the Trace ID, Span ID, Trace Flags, and Trace State. The attributes are set by the code that creates the link; they are not copied automatically from the linked-to span.

Link Data: trace_id, span_id, trace_flags, trace_state, and a set of attributes chosen by the instrumentation.
Use Case: Enriches the linking span with metadata about the cause. For example, a span for a triggered Lambda function could have a link whose attributes record the job_id and input_file of the batch job that triggered it, provided those values travel with the trigger event.

Modeling Asynchronous Workflows

This is the canonical use case for span links. They excel at representing event-driven and message-based architectures.

Message Queues: The span for a consumer processing a message from Kafka/RabbitMQ (in its own trace) can link back to the span that published the message.
Event Triggers: When a change-data-capture (CDC) listener triggers a downstream service, the downstream span can link back to the span for the database update that produced the change.
Batch Processing: As in the primary example, one-to-many triggering.
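The message-queue pattern can be sketched with the standard library alone. This is a toy simulation, not a real broker client. The helper names (new_context, publish, consume), the header keys, and the finished_spans list are all assumptions. The link is recorded on the consumer side, since that is the side that knows both contexts.

```python
import queue
import secrets


def new_context():
    """Return a fresh (trace_id, span_id) pair, i.e. the root of a new trace."""
    return secrets.token_hex(16), secrets.token_hex(8)


work_queue = queue.Queue()
finished_spans = []  # stand-in for an exporter


def publish(payload):
    trace_id, span_id = new_context()
    finished_spans.append(
        {"name": "publish", "trace_id": trace_id, "span_id": span_id, "links": []}
    )
    # The producer's context travels with the message, not with a live call stack.
    work_queue.put({"headers": {"trace_id": trace_id, "span_id": span_id}, "body": payload})


def consume():
    message = work_queue.get()
    trace_id, span_id = new_context()  # the consumer starts its own trace
    finished_spans.append(
        {
            "name": "consume",
            "trace_id": trace_id,
            "span_id": span_id,
            # Link back to the span that published the message.
            "links": [dict(message["headers"])],
        }
    )
    return message["body"]


publish({"order_id": 1})
body = consume()
producer, consumer = finished_spans
```

The two spans never share a trace ID; the only connection between them is the link stored on the consumer span.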

OpenTelemetry Specification

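Span links are a core concept in the OpenTelemetry (OTel) tracing specification. They are normally supplied when the span is created, for example through the span builder's addLink() method in Java or the links argument in Python.

API Method (Java): SpanBuilder.addLink(SpanContext spanContext, Attributes attributes)
Timing: Add links at span creation whenever possible, because head-based samplers only see links that exist at that point. Recent versions of the specification also allow adding links after the span has started, but SDK support for this varies by language and version.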
Standardization: Being part of OTel ensures vendor-agnostic implementation across different tracing backends (Jaeger, Tempo, etc.).

Visualization & Backend Support

Not all tracing backends visualize or fully utilize span link data. Support varies.

Advanced Backends: Systems like Jaeger and Honeycomb can visualize links, often showing them as dotted lines or enabling navigation between linked traces in their UI.
Analysis Value: Enables powerful querying: "Show all traces linked to this batch job ID" or "Find all errors in traces triggered by this queue message."
Implementation Check: When adopting links, verify your tracing backend's query and visualization capabilities for linked data.
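The kind of query mentioned above amounts to a reverse lookup over link data. The sketch below shows the idea on a handful of hypothetical exported spans held in memory. Real backends expose this through their own query languages, so the field names and helper functions here are assumptions.

```python
from collections import defaultdict

# Hypothetical exported spans, reduced to the fields needed here.
# Each link is a (linked_trace_id, linked_span_id) pair.
spans = [
    {"trace_id": "batch-1", "span_id": "b1", "status": "OK", "links": []},
    {"trace_id": "item-1", "span_id": "i1", "status": "OK", "links": [("batch-1", "b1")]},
    {"trace_id": "item-2", "span_id": "i2", "status": "ERROR", "links": [("batch-1", "b1")]},
    {"trace_id": "other", "span_id": "o1", "status": "ERROR", "links": []},
]

# Reverse index: linked-to trace ID -> spans that link to it.
linked_from = defaultdict(list)
for span in spans:
    for linked_trace_id, _linked_span_id in span["links"]:
        linked_from[linked_trace_id].append(span)


def traces_linked_to(trace_id):
    """All traces containing a span that links to the given trace."""
    return sorted({s["trace_id"] for s in linked_from.get(trace_id, [])})


def failed_traces_linked_to(trace_id):
    """Same lookup, restricted to linking spans that ended in error."""
    return sorted(
        {s["trace_id"] for s in linked_from.get(trace_id, []) if s["status"] == "ERROR"}
    )
```

For this data, traces_linked_to("batch-1") yields item-1 and item-2, and failed_traces_linked_to("batch-1") narrows that to item-2.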

How Span Links Work in Practice

Span links are a mechanism in distributed tracing for establishing causal relationships between spans that belong to different, independent traces.

A span link is a reference from a span in one trace to a span in another trace, used to model causal relationships that are not strict parent-child dependencies. Unlike a parent-child relationship, which creates a hierarchy within a single trace, a link creates a directed association between two distinct traces. This is essential for representing asynchronous or batch-processing workflows, where one operation (e.g., a message being published) triggers another, separate operation (e.g., a message being processed) without a continuous synchronous call chain. The link is recorded in the links collection of the span that starts later (the effect) and points back to the span context of the operation that caused it; it is not an ordinary span attribute.

In practice, links are implemented by extracting and storing the span context (trace ID, span ID) of the causal operation. Common use cases include linking a Kafka consumer span to the producer span that created the message, or connecting a batch job execution span to the individual request spans that queued the work. Observability backends use these links to navigate between related traces, providing a complete view of complex, event-driven architectures. This allows engineers to debug issues that propagate across asynchronous boundaries, which traditional parent-child tracing cannot capture.

Common Use Cases for Span Links

Span links are not just a data structure; they are a critical tool for modeling complex, asynchronous, and batch-oriented workflows in modern distributed systems. They enable observability platforms to reconstruct causal relationships that traditional parent-child spans cannot capture.

Modeling Batch Processing

Span links are essential for representing the causal relationship between a batch job's initiation and the individual units of work it processes. The batch controller runs in its own trace, and each processed item (e.g., an image, a message, a database record) gets a separate trace whose root span links back to the batch controller span. This structure:

Preserves trace independence: Each item's processing is its own trace, with its own error and latency profile.
Maintains causality: Every item trace links to the batch job trace, showing the origin without creating a monolithic, unwieldy parent span.
Enables root-cause analysis: If a batch fails, engineers can look up the item traces that link to the failing batch trace and navigate to the specific one that caused the error.

Tracing Asynchronous Triggers

In event-driven architectures, a span link connects a triggering event to the execution it initiates, which often runs in a completely different process or service. Common patterns include:

Message Queue Processing: The span in the "consumer" service that processes a message from a queue (e.g., Kafka, RabbitMQ) links back to the span in the "publisher" service that placed it there, even if hours have passed.
Workflow Orchestration: When an orchestrator (e.g., Airflow, Temporal) triggers a remote task execution, the trace of that execution can link back to the orchestrator span that triggered it.
Deferred Jobs: When a web request schedules a background job (e.g., via Celery), the job worker's trace links back to the request span. This provides a complete asynchronous causality chain, crucial for debugging systems where work is decoupled in time and space.

Representing Fan-out Operations

When a single operation triggers multiple downstream operations that run independently of it, span links can model this fan-out pattern. Each downstream trace links back to the initiating span (e.g., an API gateway or aggregator). When the initiator waits synchronously for the results, ordinary parent-child spans are usually the better fit.

Avoids timing distortion: With links, each downstream trace has its own timing, and the initiating span measures only the work of dispatching the calls.
Clarity in visualization: Each branch appears as its own trace reachable through a link, rather than as one very wide trace. (Sibling child spans can also overlap in time, so nesting by itself does not imply sequential execution.)
Example: A product page load might fan out to user profile, inventory, and recommendation services. If these are dispatched asynchronously, each service trace links back to the root span.

Connecting Logically Related Traces

Span links create semantic relationships between traces that share a business context but not a direct synchronous call chain. This is vital for business transaction tracing.

User Journey Mapping: Link a user's login trace to their subsequent checkout trace, even if they are separated by minutes of browsing.
Long-Running Processes: Connect traces from different stages of a multi-step business process (e.g., loan application: submission -> underwriting -> approval).
Cross-Request State: Associate traces that all interact with the same entity, like a document ID or a shopping cart token. This transforms traces from isolated technical artifacts into a continuous narrative of business activity.

Debugging Cascading Failures

In failure scenarios, especially those involving retries, dead-letter queues, or compensating transactions, span links provide the audit trail needed for forensic analysis.

Retry Loops: Link each retry attempt's trace back to the original failed request trace.
Dead-Letter Queue (DLQ) Analysis: When a failed message is moved to a DLQ, a link connects the original processing trace to the trace that handled the DLQ notification or manual remediation.
Compensating Transactions: In Saga patterns, if a transaction fails and a rollback is triggered, links can connect the failed operation trace to the compensating action trace. This linked history is critical for SREs to understand failure propagation and recovery paths.
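Reconstructing such an audit trail means following links backwards. The sketch below assumes one possible convention, in which each retry attempt links to the attempt before it. The retry_links mapping and the retry_history helper are hypothetical, and a real implementation would fetch each hop from the tracing backend.

```python
# Hypothetical store: trace ID -> the trace ID its root span links to
# (None for the original request).
retry_links = {
    "attempt-3": "attempt-2",
    "attempt-2": "attempt-1",
    "attempt-1": None,
}


def retry_history(trace_id, links, max_hops=100):
    """Follow links backwards from the latest attempt to the original request."""
    chain = []
    seen = set()
    current = trace_id
    # Stop at the original request, on a cycle, or after max_hops as a safety net.
    while current is not None and current not in seen and len(chain) < max_hops:
        chain.append(current)
        seen.add(current)
        current = links.get(current)
    return chain


history = retry_history("attempt-3", retry_links)
```

The cycle check matters in practice: link data comes from instrumentation in many services, and nothing in the data model prevents two spans from linking to each other.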

Integration with OpenTelemetry

The OpenTelemetry (OTel) specification formally defines span links, making them a portable, vendor-neutral construct. Key OTel concepts:

Link Interface: Defined with a SpanContext (of the linked-to span) and attributes.
Context Propagation: The SpanContext in a link contains the essential identifiers (trace_id, span_id) needed to retrieve the linked trace from a backend.
Backend Support: Observability backends like Jaeger, Zipkin, and commercial APMs use these links to build interactive trace graphs that users can navigate. Using OTel ensures span links provide interoperable causality data across any instrumented service in your stack.

DISTRIBUTED TRACE RELATIONSHIPS

Span Links vs. Parent-Child Relationships

A comparison of the two primary mechanisms for connecting spans in distributed tracing, highlighting their distinct purposes and technical characteristics.

Feature	Parent-Child Relationship	Span Link
Primary Purpose	Models synchronous, causal execution flow within a single trace.	Models asynchronous, causal relationships between spans in different traces.
Trace Context	Spans share the same Trace ID.	Spans usually have different Trace IDs.
Structural Model	Forms a Directed Acyclic Graph (DAG) hierarchy within a trace.	Forms a directed graph of causal references across trace boundaries.
Timing Relationship	Child span typically starts within the parent span's duration.	No inherent timing constraint; linked spans may be concurrent or sequential.
Causality	Represents direct, often synchronous, causation (e.g., a function call).	Represents indirect, often asynchronous, causation (e.g., a message queued for batch processing).
Use Case Example	An HTTP server span calling a database, creating a child span for the query.	A span in a batch job processor linking to the span that originally enqueued the work item.
OpenTelemetry Span Kind Pairing	Typically involves Client/Server or Producer/Consumer pairs.	Can link any span kind; often used with Producer/Consumer or Internal spans.
Data Volume Impact	Increases the depth and complexity of a single trace.	Creates a network of related traces, increasing cross-trace analysis complexity.
Backend Visualization	Nested within a single flame graph or trace view.	Displayed as connected nodes in a trace graph or via dedicated link navigation.

SPAN LINKS

Frequently Asked Questions

Span links are a core concept in distributed tracing for representing causal relationships across different execution flows. These questions address their purpose, mechanics, and practical use cases.

A span link is a reference from one span to another span, usually in a different trace, used to represent a causal relationship between distinct units of work that are not directly connected by a parent-child hierarchy. Unlike a parent-child relationship, which exists within a single trace, a link can connect spans across trace boundaries to model asynchronous or batch-processing workflows. The link records the span context (trace ID and span ID) of the span it points to. This mechanism is essential for accurately modeling complex distributed system interactions where a single action (like publishing a message) can trigger multiple, independent downstream processes.

Related Terms

Span links are a core concept within distributed tracing. To fully understand their role, it's essential to grasp the related primitives and systems that define the observability landscape.

Span

A span is the fundamental unit of work in distributed tracing, representing a single, named, and timed operation within a service. It is the basic building block from which traces are constructed.

Key Properties: Contains a start/end timestamp, operation name, span ID, and span attributes.
Parent-Child Relationships: Spans are nested hierarchically within a single trace to show causal execution flow (e.g., a database query span is a child of an API handler span).
Contrast with Links: While parent-child relationships exist within a trace, span links can connect spans across different traces, representing a different type of causal relationship.

Trace

A trace is a directed acyclic graph (DAG) of spans that represents the complete end-to-end path of a single request or transaction as it flows through a distributed system.

Trace ID: All spans within a trace share a globally unique trace ID for correlation.
Visualization: Often visualized as a flame graph, where the width of bars represents span duration and nesting shows the call hierarchy.
Link Context: Span links create references between spans that reside in separate traces, allowing you to model relationships like one trace triggering another asynchronously.

Span Context

Span context is the immutable, portable state that must be propagated across process boundaries to enable distributed tracing. It contains the minimal data needed to identify and correlate work.

Core Components: Includes the trace ID, the current span ID, trace flags (for sampling), and trace state (for vendor-specific data).
Propagation: This context is carried via distributed context propagation mechanisms like W3C Trace Context headers in HTTP requests or metadata in messaging queues.
Link Foundation: To create a span link, you must capture the span context of the span you wish to link to, typically from a message or event payload.
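Capturing a span context from a message usually means reading a W3C Trace Context traceparent header, which has the form version-traceid-spanid-flags in lowercase hex. The sketch below builds and parses that header with the standard library. As a simplification it accepts only version 00, and the function names and message layout are assumptions. In production code an OpenTelemetry propagator would normally do this work.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def format_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Serialize a span context as a version-00 traceparent header value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str):
    """Return (trace_id, span_id, sampled), or None if the header is unusable."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid; other versions are rejected here for simplicity.
    if version != "00" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    sampled = bool(int(flags, 16) & 0x01)
    return trace_id, span_id, sampled


# Producer side: attach the current span's context to the outgoing message.
message = {
    "headers": {
        "traceparent": format_traceparent(
            "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"
        )
    },
    "body": b"...",
}

# Consumer side: recover the context to use as the target of a span link.
link_target = parse_traceparent(message["headers"]["traceparent"])
```

The tuple in link_target holds exactly the identifiers a link needs; a malformed or missing header simply means the consumer span is created without a link.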

OpenTelemetry (OTel)

OpenTelemetry (OTel) is the open-source, vendor-neutral standard for generating, collecting, and exporting telemetry data, including traces, metrics, and logs. It provides the canonical APIs and SDKs for implementing tracing.

Span Link API: OTel's API has first-class support for adding span links to a span during its creation.
Standardized Semantics: Defines standard span attributes and span kinds that provide crucial context for linked spans.
Protocol & Collector: Uses OTLP (OpenTelemetry Protocol) for transport and the OpenTelemetry Collector for processing, where tail sampling policies can make decisions based on linked trace data.

Tail Sampling

Tail sampling is a sampling strategy where the decision to retain or discard a trace is made after the request is complete, based on the full set of collected span data and attributes.

Decision Criteria: Can sample traces based on attributes like high latency, error status, specific business logic outcomes, or the presence of certain span links.
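Use Case for Links: One application is to retain all traces that are linked to a trace containing a critical error, so the complete causal chain is preserved for root cause analysis even if the linked traces would have been dropped by probabilistic head sampling. Because tail sampling decisions are normally made one trace at a time, this requires a sampler or backend that can correlate decisions across traces; verify that yours can before relying on it.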
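The link-aware retention idea can be sketched as a pure function over buffered trace summaries. This is a conceptual illustration, not the configuration or behavior of any particular collector. The traces structure and the traces_to_keep function are hypothetical, and the sketch follows links only one hop in either direction.

```python
# Hypothetical buffered traces: trace ID -> summary built once the trace is complete.
# "links_to" holds the trace IDs that spans in this trace link to.
traces = {
    "batch-1": {"has_error": True, "links_to": set()},
    "item-1": {"has_error": False, "links_to": {"batch-1"}},
    "item-2": {"has_error": False, "links_to": set()},
}


def traces_to_keep(traces):
    """Keep error traces plus any trace one link away from them, in either direction."""
    errors = {tid for tid, summary in traces.items() if summary["has_error"]}
    known = set(traces)
    keep = set(errors)
    for tid, summary in traces.items():
        if summary["links_to"] & errors:
            keep.add(tid)                        # this trace links to an error trace
        if tid in errors:
            keep |= summary["links_to"] & known  # the error trace links to these
    return keep


kept = traces_to_keep(traces)
```

The hard part in a real pipeline is not this set logic but the buffering: the linked traces must still be held in memory, on the same sampler instance, when the error is discovered.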

Service Graph

A service graph is a topological map of a distributed system, automatically derived from trace data, that shows services as nodes and the request flows between them as edges.

Dynamic Dependency Mapping: Reveals service dependencies and call patterns, which is vital for architecture reviews and impact analysis.
Beyond Direct Calls: While built from parent-child span relationships, advanced service graph implementations can also incorporate data from span links to visualize indirect or asynchronous relationships between services, such as event-driven triggers or batch processing workflows that don't follow a synchronous request-response pattern.
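Deriving asynchronous edges from links can be sketched as follows. The span records, the service names, and the async_edges helper are invented for illustration. Each edge runs from the service that owns the linked-to span (the cause) to the service that owns the linking span (the effect).

```python
from collections import Counter

# Hypothetical spans keyed by span ID; links hold the IDs of linked-to spans.
spans = {
    "s1": {"service": "checkout", "links": []},
    "s2": {"service": "billing-worker", "links": ["s1"]},
    "s3": {"service": "email-worker", "links": ["s1"]},
    "s4": {"service": "billing-worker", "links": ["s1", "missing"]},
}


def async_edges(spans):
    """Count (cause service, effect service) edges implied by span links."""
    edges = Counter()
    for span in spans.values():
        for linked_id in span["links"]:
            linked = spans.get(linked_id)
            if linked is None:
                continue  # linked span was not collected, e.g. it was sampled out
            edges[(linked["service"], span["service"])] += 1
    return edges


edges = async_edges(spans)
```

Links to spans that were never collected are skipped rather than treated as errors, since sampling routinely leaves one side of a link missing.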

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
