Glossary

Datadog Agent

The Datadog Agent is a lightweight software package installed on hosts that collects events and metrics, forwards them to the Datadog platform, and executes checks for integrations and custom monitoring.

AGENT TELEMETRY PIPELINES

What is the Datadog Agent?

A core component of the Datadog observability platform responsible for collecting and forwarding telemetry data from hosts.

The Datadog Agent is a lightweight, open-source software daemon installed on monitored hosts that collects metrics, traces, and logs, executes integration checks, and forwards this telemetry data to the Datadog platform for analysis. It operates as the foundational data collection layer, enabling comprehensive observability by gathering system-level data (CPU, memory) and application-specific signals through its modular architecture and extensive library of out-of-the-box integrations.

The Agent collects data continuously with low overhead, supports custom checks written in Python, and can be deployed across bare metal, virtual machines, containers, and Kubernetes (via a DaemonSet). It enriches data with tags, secures communication with Datadog's backend, and works in tandem with the Trace Agent and Process Agent for APM and infrastructure monitoring, so that all telemetry flows through one pipeline.
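As a minimal sketch of what a custom Python check can look like: the `AgentCheck` base class and its `gauge` method come from Datadog's `datadog_checks.base` package, while the metric name `custom.dir.entries`, the `path` instance option, and the helper `count_entries` are hypothetical and chosen only for illustration. The fallback class exists solely so the sketch runs outside the Agent's embedded Python environment.

```python
import os

try:
    # Available inside the Agent's embedded Python environment.
    from datadog_checks.base import AgentCheck
except ImportError:
    class AgentCheck:
        """Stand-in (assumption) so the sketch runs without the Agent installed."""

        def __init__(self, *args, **kwargs):
            self.submitted = []

        def gauge(self, name, value, tags=None):
            # Record the sample instead of handing it to the Agent's aggregator.
            self.submitted.append((name, value, tuple(tags or [])))


def count_entries(path):
    """Return the number of entries directly inside `path`."""
    return len(os.listdir(path))


class DirEntriesCheck(AgentCheck):
    """Hypothetical check reporting how many entries a directory holds."""

    def check(self, instance):
        # `instance` is one entry from the check's YAML configuration.
        path = instance.get("path", "/tmp")
        tags = ["path:" + path] + list(instance.get("tags", []))
        self.gauge("custom.dir.entries", count_entries(path), tags=tags)
```

In a real deployment, a file like this would typically sit in the Agent's `checks.d` directory with a matching YAML file under `conf.d`; the Agent then calls `check` once per configured instance on each collection run.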

DATADOG AGENT

Key Features and Capabilities

The Datadog Agent is a lightweight, open-source software package installed on hosts to collect observability data. It functions as the universal data collector for the Datadog platform, gathering metrics, traces, logs, and events.

Unified Data Collection

The Agent provides a single, integrated daemon for collecting all major telemetry signals. It gathers system metrics (CPU, memory, disk I/O), application performance monitoring (APM) traces sent by tracing libraries in instrumented applications, logs from files, network ports, or journald, and integration metrics from over 650 supported technologies. This unified collection removes the need for multiple, disparate agents, which simplifies deployment and configuration management.

Out-of-the-Box Integrations

A core capability is its extensive library of built-in checks. These are pre-configured plugins that automatically collect metrics and service checks from popular technologies. Examples include:

Databases: PostgreSQL, Redis, MongoDB
Web Servers: Nginx, Apache
Cloud Platforms: AWS, Google Cloud, Azure
Container Orchestrators: Kubernetes, Docker

Each check parses standard endpoints or APIs, transforming vendor-specific metrics into a normalized format for the Datadog platform.

Autodiscovery for Dynamic Environments

In containerized environments like Kubernetes, the Agent uses Autodiscovery to automatically identify services running in pods and apply the correct monitoring configuration. It monitors the container runtime or orchestrator API for new, updated, or terminated containers. When a new container starts, Autodiscovery matches its identifiers (e.g., image name, container labels, Kubernetes annotations) against check templates and dynamically enables the appropriate monitoring, ensuring no ephemeral workload goes unobserved.
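The matching step can be pictured with a small simulation. This is not the Agent's implementation: `CheckTemplate`, `Container`, `short_image`, and `match_templates` are hypothetical names, and matching only on the short image name is a simplification of the identifiers listed above. The `%%host%%` placeholder mirrors the template variable that Autodiscovery resolves to the container's address.

```python
from dataclasses import dataclass, field


@dataclass
class CheckTemplate:
    check_name: str
    ad_identifiers: list          # short image names this template applies to
    instance: dict = field(default_factory=dict)


@dataclass
class Container:
    container_id: str
    image: str                    # e.g. "docker.io/library/redis:7.2"
    ip: str


def short_image(image):
    """Strip registry path, tag, and digest: '.../redis:7.2' -> 'redis'."""
    name = image.rsplit("/", 1)[-1]
    return name.split("@", 1)[0].split(":", 1)[0]


def match_templates(container, templates):
    """Return resolved check configs for every template matching the container."""
    resolved = []
    for tpl in templates:
        if short_image(container.image) in tpl.ad_identifiers:
            # Substitute the %%host%% placeholder with the container's address.
            instance = {
                key: value.replace("%%host%%", container.ip)
                if isinstance(value, str) else value
                for key, value in tpl.instance.items()
            }
            resolved.append({"check": tpl.check_name, "instance": instance})
    return resolved


templates = [
    CheckTemplate("redisdb", ["redis"], {"host": "%%host%%", "port": 6379}),
    CheckTemplate("nginx", ["nginx"],
                  {"nginx_status_url": "http://%%host%%/nginx_status"}),
]
new_container = Container("abc123", "docker.io/library/redis:7.2", "10.0.0.12")
configs = match_templates(new_container, templates)
```

Here only the Redis template matches the new container, so a single check configuration is produced with the container's IP filled in.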

Local Aggregation and Forwarding

The Agent preprocesses data on the host before transmission. It aggregates metrics at configurable intervals (default: 15 seconds), reducing the total number of data points. For traces and logs, it batches payloads and compresses them. It then forwards the processed data to Datadog's intake endpoints over HTTPS, with automatic retries and local buffering so that short network interruptions do not immediately cause data loss.
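The effect of local aggregation and batching can be sketched in a few lines. This is an illustrative model under stated assumptions, not the Agent's code: the sample tuple layout, the `aggregate` and `build_payload` helpers, and the JSON payload shape are all hypothetical, and the 15-second interval simply follows the default quoted above.

```python
import json
import zlib


def aggregate(samples, interval=15):
    """Collapse raw samples into one point per (metric, tags, time bucket).

    Each sample is (timestamp, name, value, metric_type, tags).
    Counts are summed within a bucket; gauges keep the last value seen.
    """
    buckets = {}
    for ts, name, value, mtype, tags in sorted(samples, key=lambda s: s[0]):
        bucket_start = int(ts) - int(ts) % interval
        key = (name, tuple(sorted(tags)), bucket_start, mtype)
        if mtype == "count":
            buckets[key] = buckets.get(key, 0) + value
        else:  # gauge: the most recent sample wins
            buckets[key] = value
    return [
        {"metric": n, "tags": list(t), "timestamp": b, "type": m, "value": v}
        for (n, t, b, m), v in buckets.items()
    ]


def build_payload(points):
    """Serialize a batch as JSON and compress it before sending."""
    raw = json.dumps({"series": points}).encode("utf-8")
    return zlib.compress(raw)


samples = [
    (90, "app.requests", 1, "count", ["env:prod"]),
    (95, "app.requests", 1, "count", ["env:prod"]),
    (100, "app.requests", 1, "count", ["env:prod"]),
    (106, "app.requests", 1, "count", ["env:prod"]),
    (91, "system.cpu.user", 12.5, "gauge", []),
    (99, "system.cpu.user", 14.0, "gauge", []),
]
points = aggregate(samples)       # six samples collapse into three points
payload = build_payload(points)   # one compressed batch instead of six sends
```

Six raw samples become three data points and a single compressed payload, which is the reduction in volume that the paragraph above describes.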

Live Processes & Continuous Profiling

Beyond standard metrics, the Agent supports deep runtime diagnostics. The Live Processes feature discovers and monitors all processes running on a host, collecting their command, resource consumption, and lineage. The Continuous Profiler, which runs inside the application through Datadog's language libraries and sends its profiles through the Agent, samples CPU usage, memory allocation, and wall time at the method level for supported languages (Java, Python, Go, etc.), generating flame graphs that pinpoint code-level performance bottlenecks with minimal overhead (<2% typical).

Security & Network Monitoring

The Agent includes modules for security and network observability. The Security Agent performs runtime security detection, scanning for file integrity changes, suspicious process activity, and potential compliance violations based on rules. The Network Performance Monitoring (NPM) module uses eBPF (on Linux) to trace TCP/UDP communications between services at the kernel level, providing topology maps, connection latency, and throughput metrics without requiring application code changes or port mirroring.

How the Datadog Agent Works

The Datadog Agent is the core data collection engine for the Datadog observability platform, responsible for gathering, processing, and forwarding telemetry from hosts and containers.

The Datadog Agent is a lightweight, open-source software daemon installed on hosts that collects metrics, traces, logs, and events, forwarding them to the Datadog platform. It operates via a collection of checks—modular integrations—that gather data from specific systems like databases, web servers, or custom applications. The agent runs continuously, providing real-time visibility into infrastructure and application health without requiring constant manual intervention.

Architecturally, the Agent consists of a core process plus collector and forwarder subsystems. It aggregates data locally, applies tags for context, and batches payloads for efficient transmission over HTTPS, directly or through a proxy. In Kubernetes it is typically deployed as a DaemonSet, which ensures one instance per node. The Agent also supports Live Processes collection and continuous profiling, which provide deep diagnostics directly from the monitored system.

Frequently Asked Questions

Essential questions about the Datadog Agent, the open-source software that collects observability data from your infrastructure and applications.

The Datadog Agent is a lightweight, open-source software package installed on hosts that collects observability data (metrics, traces, logs, and events) and forwards it to the Datadog platform. It runs as a persistent background service (daemon) that executes integration checks, runs custom checks, and aggregates data. Some integrations use a pull model, in which the Agent queries an API or endpoint; others use a push model, in which applications send stats to the Agent. In both cases the Agent batches the data and transmits it to Datadog's intake endpoints over HTTPS, directly or through a proxy. Its modular architecture includes a core Agent and optional components such as the Trace Agent for APM (Application Performance Monitoring) and the Process Agent.


Related Terms

The Datadog Agent operates within a broader ecosystem of telemetry collection and processing. These related concepts define the tools, patterns, and protocols that enable comprehensive observability.

OpenTelemetry (OTel)

A vendor-neutral, open-source observability framework providing unified APIs, SDKs, and tools to generate, collect, and export telemetry data (traces, metrics, logs). It standardizes instrumentation, allowing data to be sent to multiple backends, including Datadog, via the OpenTelemetry Protocol (OTLP). Key components include:

OTel Collector: A vendor-agnostic proxy for receiving, processing, and exporting telemetry.
Auto-Instrumentation: Automatically adding observability code without source changes.
W3C TraceContext: The standard for propagating trace context across services.
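To make the propagation format concrete, here is a small sketch of building and parsing a W3C `traceparent` header. The header layout (`version-traceid-parentid-flags`, lowercase hex) follows the W3C Trace Context specification; the helper names are hypothetical, and the parser handles only version `00` as a simplification.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled=True):
    """Start a new trace: random 16-byte trace ID and 8-byte span ID."""
    trace_id = secrets.token_hex(16)   # 32 hex characters
    span_id = secrets.token_hex(8)     # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Return the header's fields, or None if it is malformed."""
    match = TRACEPARENT_RE.match(header.strip())
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid; other versions are out of scope for this sketch.
    if version != "00" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(header):
    """Continue a trace in a downstream service: same trace ID, new span ID."""
    parent = parse_traceparent(header)
    if parent is None:
        return new_traceparent()
    flags = "01" if parent["sampled"] else "00"
    return f"00-{parent['trace_id']}-{secrets.token_hex(8)}-{flags}"
```

A service receiving a request would parse the incoming header, create its own span, and pass the result of `child_traceparent` on any outgoing calls so that all spans share one trace ID.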

Distributed Tracing

A method of following requests as they flow through a distributed system, recording the full path, latency, and relationships between operations. A trace is composed of spans, which are individual timed operations. The Datadog Agent can collect trace data from instrumented applications. Critical concepts include:

Trace Context: Metadata (e.g., trace ID, span ID) propagated across service boundaries.
Sampling Strategies: Rules to reduce data volume, such as head-based (decision at start) or tail-based (decision after request completion) sampling.
Use Case: Performance debugging across microservices and serverless functions.
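The difference between the two sampling strategies can be sketched as follows. These functions are generic illustrations, not Datadog's sampling algorithm; the hash-based decision, the span dictionary fields (`start_ms`, `end_ms`, `error`), and the 500 ms threshold are assumptions made for the example.

```python
import hashlib


def head_sample(trace_id, rate):
    """Head-based: decide when the trace starts, using only the trace ID.

    Hashing the ID makes the decision deterministic, so every service that
    sees the same trace ID reaches the same keep/drop verdict.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")      # uniform in [0, 2**64)
    return bucket < int(rate * 2**64)


def tail_sample(spans, latency_threshold_ms=500):
    """Tail-based: decide after the trace completes, using its contents.

    Keeps traces that contain an error or that ran longer than the threshold.
    """
    has_error = any(span.get("error") for span in spans)
    duration = max(s["end_ms"] for s in spans) - min(s["start_ms"] for s in spans)
    return has_error or duration >= latency_threshold_ms


fast_ok = [{"start_ms": 0, "end_ms": 40}, {"start_ms": 5, "end_ms": 30}]
fast_error = [{"start_ms": 0, "end_ms": 40, "error": True}]
slow_ok = [{"start_ms": 0, "end_ms": 300}, {"start_ms": 250, "end_ms": 900}]
```

Head-based sampling is cheap because it needs no buffering, but it cannot know whether a trace will turn out to be interesting; tail-based sampling can keep the slow and failing traces at the cost of holding spans until the request finishes.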

Sidecar Pattern & DaemonSet

Deployment models for auxiliary software like observability agents. The Datadog Agent can be deployed using both patterns:

Sidecar Pattern: The agent runs in a separate container alongside the main application container in a Kubernetes pod. This provides isolation and allows the agent to be updated independently.
DaemonSet: A Kubernetes controller that ensures a copy of the Datadog Agent pod runs on every node (or a subset) in the cluster. This is efficient for collecting host-level metrics (CPU, memory, disk) from all nodes.

Telemetry Data Pipeline

The end-to-end system for moving observability data from sources to backends. The Datadog Agent is a source collector within this pipeline. Related components include:

Data Enrichment: Adding context (tags, environment) to raw metrics and traces.
Backpressure Handling: Managing flow when the backend is slower than the data source.
Dead Letter Queue (DLQ): A holding area for events that fail repeated processing attempts.
Alternative pipeline tools include Vector.dev (high-performance router), Fluentd (unified logging), and the OTel Collector.
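Backpressure handling and a dead letter queue can be illustrated with a small, generic forwarder. This sketch does not describe how the Datadog Agent or any specific tool implements these ideas; the `Forwarder` class, its drop-oldest policy, and the retry limit are hypothetical choices for the example.

```python
from collections import deque


class Forwarder:
    """Bounded buffer with retries and a dead letter queue (illustrative)."""

    def __init__(self, send, max_buffer=1000, max_attempts=3):
        self.send = send                  # callable returning True on success
        self.buffer = deque()
        self.max_buffer = max_buffer
        self.max_attempts = max_attempts
        self.dead_letters = []
        self.dropped = 0

    def submit(self, event):
        """Backpressure policy: when the buffer is full, drop the oldest event."""
        if len(self.buffer) >= self.max_buffer:
            self.buffer.popleft()
            self.dropped += 1
        self.buffer.append({"event": event, "attempts": 0})

    def flush(self):
        """Try each buffered event once; requeue failures or dead-letter them."""
        for _ in range(len(self.buffer)):
            item = self.buffer.popleft()
            item["attempts"] += 1
            if self.send(item["event"]):
                continue
            if item["attempts"] >= self.max_attempts:
                self.dead_letters.append(item["event"])
            else:
                self.buffer.append(item)


def flaky_send(event):
    """Pretend backend that always rejects the event named 'bad'."""
    return event != "bad"


fwd = Forwarder(flaky_send, max_buffer=2, max_attempts=3)
fwd.submit("ok")
fwd.submit("bad")
for _ in range(3):
    fwd.flush()
```

After three flushes the event that kept failing sits in `dead_letters` rather than blocking the buffer, while the bounded buffer keeps memory use predictable when the backend is slow.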

Metric Collection & Export

The process of gathering quantitative measurements about system behavior. The Datadog Agent collects host metrics and application metrics via integrations and DogStatsD. Related protocols and systems:

StatsD: A simple UDP-based protocol for sending application metrics (counters, timers, gauges). Datadog's DogStatsD is an extension.
Prometheus: An open-source monitoring system using a pull model over HTTP. The Datadog Agent can scrape Prometheus endpoints through its OpenMetrics check.
Metric Exporter: The component within an SDK that sends aggregated metrics to a backend.
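The push model behind StatsD and DogStatsD comes down to formatting a short text datagram and sending it over UDP. The sketch below assumes the DogStatsD datagram layout `<name>:<value>|<type>|@<sample_rate>|#<tags>` and the default port 8125; the helper names `format_datagram` and `send_datagram` are hypothetical, and a production application would normally use an official client library instead.

```python
import socket


def format_datagram(name, value, metric_type, sample_rate=None, tags=None):
    """Build a datagram such as 'page.views:1|c|#env:prod'.

    metric_type is a short code, e.g. 'c' (count), 'g' (gauge), 'ms' (timer).
    The tag section after '#' is the DogStatsD extension to plain StatsD.
    """
    parts = [f"{name}:{value}", metric_type]
    if sample_rate is not None and sample_rate != 1:
        parts.append(f"@{sample_rate}")
    if tags:
        parts.append("#" + ",".join(tags))
    return "|".join(parts)


def send_datagram(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; nothing is reported if no Agent is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


views = format_datagram("page.views", 1, "c", tags=["env:prod", "service:web"])
latency = format_datagram("request.latency", 320, "ms", sample_rate=0.5)
```

Because UDP is connectionless, the application never blocks on the Agent, which is why this style of metric submission adds very little overhead to the instrumented code.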

Continuous Profiling & eBPF

Advanced observability techniques that provide deep system insights with low overhead.

Continuous Profiling: Automatically collecting application performance profiles (CPU, memory, I/O) over time. Tools like Pyroscope offer this; Datadog provides Continuous Profiler as a feature.
eBPF Tracing: A Linux kernel technology that allows safe program execution in the kernel for deep observability. It can trace system calls, network traffic, and kernel functions. The Datadog Agent uses eBPF for network performance monitoring, security, and kernel-level profiling without requiring application changes.

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over more than five years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
