
ROS Bag

A ROS Bag is a file format for recording and playing back ROS topic data, enabling offline debugging, analysis, and simulation of robotic system behavior.
DATA LOGGING

What is a ROS Bag?

A ROS Bag is the standard file format for recording and replaying message data in the Robot Operating System (ROS) ecosystem.

A ROS Bag is a container file format that records serialized message data published on ROS topics, enabling logging, offline analysis, and repeatable replay of a robotic system's runtime sensor data, commands, and state. It functions as a timestamped data recorder, capturing the temporal sequence of asynchronous messages across the ROS graph. This allows developers to debug complex interactions, train machine learning models, and perform regression testing without requiring continuous access to physical hardware.

The format is optimized for sequential read/write operations and supports compression and chunking for efficient storage. During playback, a tool such as ros2 bag play (rosbag play in ROS 1) republishes the recorded messages onto their respective topics, recreating the original data flow with configurable timing. This capability is fundamental for simulation, sensor data annotation, and validating algorithms against ground-truth datasets. In ROS 2, bag recording was redesigned as rosbag2, which uses pluggable storage backends (SQLite3 .db3 files originally, with MCAP as the default since ROS 2 Iron) and integrates with the underlying Data Distribution Service (DDS) middleware.

DATA LOGGING

Core Characteristics of ROS Bags

The core characteristics of a ROS Bag define its utility, performance, and integration within the ROS ecosystem.

01

Topic-Based Selective Recording

ROS Bags do not record the entire system state but instead capture data from specific ROS Topics. This allows engineers to log only the necessary sensor data (e.g., /camera/image_raw, /scan) and command streams, minimizing file size and focusing analysis.

  • Selective Capture: List the topic names on the ros2 bag record command line (for example, ros2 bag record /camera/image_raw /scan) to record only those topics.
  • Efficiency: Avoids logging internal debug topics, conserving storage.
  • Use Case: Essential for isolating data from a malfunctioning perception node without capturing irrelevant actuator commands.
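
As a rough sketch of selective capture, the helper below assembles a ros2 bag record command line for a chosen set of topics. The function name build_record_command and the bag name perception_debug are hypothetical. It assumes topic names are accepted as positional arguments and that -o names the output bag, and it only builds the argument list rather than starting a recorder.

```python
import shlex


def build_record_command(topics, output_name):
    """Return the argv list for recording only the given topics (illustrative helper)."""
    if not topics:
        raise ValueError("at least one topic is required for selective recording")
    # "-o" names the output bag; the topics follow as positional arguments.
    return ["ros2", "bag", "record", "-o", output_name, *topics]


cmd = build_record_command(["/camera/image_raw", "/scan"], "perception_debug")
print(shlex.join(cmd))
```

On a machine with ROS 2 sourced, the resulting list could be handed to subprocess.run to start the recording.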
02

Time-Synchronized Playback

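During playback, the player republishes the recorded messages onto the live ROS graph in their original order and with the same inter-message delays as in the recording. Message contents, including any header timestamps, are unchanged. This gives a repeatable approximation of past system behavior, although downstream nodes may still respond nondeterministically.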

  • Deterministic Replay: Enables exact reproduction of sensor streams for debugging intermittent faults.
  • Simulation Input: A recorded bag file can serve as a high-fidelity sensor input for testing new perception algorithms offline.
  • Tool: The ros2 bag play command handles this timing and can optionally change the playback speed (--rate) or loop the bag (--loop).
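
The timing behavior can be sketched without ROS at all. The snippet below is a simplified model, not the rosbag2 player: given recorded timestamps in nanoseconds, it computes how long a player would wait before publishing each message, with a rate factor standing in for the playback-speed option. The function name replay_delays and the sample timestamps are illustrative.

```python
def replay_delays(timestamps_ns, rate=1.0):
    """Seconds to wait before each message so the recorded spacing is preserved."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    delays = []
    previous = None
    for stamp in sorted(timestamps_ns):
        # The first message is published immediately; later ones keep their recorded gap.
        gap_ns = 0 if previous is None else stamp - previous
        delays.append(gap_ns / 1e9 / rate)
        previous = stamp
    return delays


recorded = [1_000_000_000, 1_100_000_000, 1_400_000_000]
print(replay_delays(recorded))            # original pacing
print(replay_delays(recorded, rate=2.0))  # gaps halved for 2x playback
```

A real player also has to account for publish overhead and clock drift, which this model ignores.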
04

Integration with ROS 2 Tools

ROS Bags are a first-class citizen within the ROS 2 tooling ecosystem, enabling a seamless workflow from recording to visualization and analysis.

  • CLI Tools: ros2 bag is the primary command-line interface for recording, playing, and inspecting bags.
  • RViz Visualization: Played-back topics can be visualized in RViz just like live data, for replaying sensor feeds and robot trajectories.
  • rqt_bag GUI: Provides a graphical interface for inspecting message timelines, contents, and plotting data.
  • Python API (rosbag2_py): Allows programmatic bag creation, reading, and data extraction for custom analysis pipelines.
05

Performance and Storage Considerations

Recording high-frequency sensor data (e.g., images, point clouds) presents significant I/O and storage challenges. ROS Bag configuration is critical for managing system load.

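  • Compression: The MCAP format supports chunk-level compression (Zstandard or LZ4), and rosbag2 can also compress whole files or individual messages, which reduces file size for high-volume streams such as images.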
  • Buffer Management: The recorder uses a configurable cache to buffer messages before writing to disk, preventing drops during I/O spikes.
  • Storage Options: Can write to fast SSDs for high-bandwidth logging or network storage for long-term archiving.
  • Split Files: Bags can be automatically split by size or duration to manage file handles and facilitate processing.
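
A back-of-the-envelope calculation helps when choosing split sizes and storage targets. The sketch below estimates the uncompressed payload volume from per-topic rates and message sizes, then the number of files produced if the recorder splits at a size limit. The workload numbers and function names are hypothetical, and the estimate ignores message headers, indexes, and compression.

```python
import math


def estimate_bag_bytes(streams, duration_s):
    """Uncompressed payload estimate: sum of rate * message size * duration."""
    return sum(rate_hz * msg_bytes * duration_s for rate_hz, msg_bytes in streams.values())


def split_count(total_bytes, max_file_bytes):
    """Number of files if the recorder splits whenever a file reaches max_file_bytes."""
    return max(1, math.ceil(total_bytes / max_file_bytes))


# Hypothetical workload: 640x480 RGB images at 30 Hz plus 8 KiB laser scans at 10 Hz.
streams = {
    "/camera/image_raw": (30, 640 * 480 * 3),
    "/scan": (10, 8 * 1024),
}
total = estimate_bag_bytes(streams, duration_s=60)
print(f"{total / 1e9:.2f} GB over 60 s, {split_count(total, 1_000_000_000)} files at 1 GB each")
```

Even this small setup produces well over a gigabyte per minute, which is why the image topic dominates any storage plan.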
06

Primary Use Cases in Development

ROS Bags are indispensable throughout the robotic software lifecycle, serving specific roles in development, testing, and deployment.

  • Offline Debugging: Record a problematic run, then repeatedly analyze sensor data and internal state to isolate bugs.
  • Algorithm Regression Testing: Use a standard set of "golden" bag files as test fixtures to validate that perception or planning changes do not degrade performance.
  • Documentation & Sharing: Bags provide a reproducible dataset for sharing with team members or for publication, ensuring everyone analyzes the exact same sensor inputs.
  • Simulation Bridging: Record data from a physical robot to create a realistic simulation scenario, or play simulated data into a real robot's software stack for Hardware-in-the-Loop (HIL) testing.
ROS BAG

How ROS Bags Work: Recording and Playback

A ROS Bag is the primary file format for recording and playing back ROS topic data, enabling offline debugging, analysis, and simulation of robotic system behavior.

A ROS Bag stores the serialized message data published on ROS topics. The recorder acts as a passive subscriber on the specified topics and writes a timestamped log of every message it receives, without altering what the live system publishes. The recorded bag can later be replayed, publishing the stored messages in chronological order to simulate the original data flow for offline testing, algorithm development, and post-mortem analysis of robotic experiments.

Playback is managed by the rosbag2 utilities in ROS 2, which allow flexible control over the publishing rate, specific topics, and time ranges. Storage is pluggable: SQLite3 (.db3) was the original default backend, and MCAP has been the default since ROS 2 Iron. This capability is fundamental for debugging perception pipelines, validating state estimation algorithms, and creating reproducible datasets for training machine learning models without requiring constant access to physical hardware or sensor streams.
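
For bags written with the SQLite3 backend, the storage file is an ordinary database that can be inspected with standard tools. The sketch below builds an in-memory database with a simplified, assumed version of the topics and messages tables, then summarizes message counts and time ranges per topic. The exact schema may differ between rosbag2 releases, so treat the column list as an assumption and check it against a real .db3 file before relying on it.

```python
import sqlite3

# Assumed, simplified version of the layout used by the rosbag2 SQLite3 storage plugin.
SCHEMA = """
CREATE TABLE topics (id INTEGER PRIMARY KEY, name TEXT, type TEXT, serialization_format TEXT);
CREATE TABLE messages (id INTEGER PRIMARY KEY, topic_id INTEGER, timestamp INTEGER, data BLOB);
CREATE INDEX timestamp_idx ON messages (timestamp);
"""


def summarize(conn):
    """Return (topic name, message count, first stamp, last stamp) per topic."""
    query = """
        SELECT t.name, COUNT(m.id), MIN(m.timestamp), MAX(m.timestamp)
        FROM topics t JOIN messages m ON m.topic_id = t.id
        GROUP BY t.id ORDER BY t.name
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # a real bag would be opened by file path instead
conn.executescript(SCHEMA)
conn.execute("INSERT INTO topics VALUES (1, '/scan', 'sensor_msgs/msg/LaserScan', 'cdr')")
conn.execute("INSERT INTO topics VALUES (2, '/cmd_vel', 'geometry_msgs/msg/Twist', 'cdr')")
rows = [(1, 100, b"a"), (1, 200, b"b"), (2, 150, b"c")]
conn.executemany("INSERT INTO messages (topic_id, timestamp, data) VALUES (?, ?, ?)", rows)
print(summarize(conn))
```

The data column holds the serialized message bytes, so decoding the payload still requires the matching message definitions.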

ROBOT OPERATING SYSTEM (ROS)

Primary Use Cases for ROS Bags

The primary applications of ROS Bags are foundational to the development and deployment lifecycle.

01

Offline Debugging and Post-Mortem Analysis

This is the most fundamental use case. Engineers record sensor data and internal system state during field tests or lab experiments. The bag file acts as a deterministic log of the system's execution, allowing them to:

  • Replay the exact sensor inputs and messages to reproduce bugs.
  • Inspect the timing and content of every message on every topic using tools like rqt_bag or ros2 bag info.
  • Isolate failures by correlating anomalous robot behavior with specific sensor readings or internal state changes that occurred seconds or minutes prior.
02

Algorithm Development and Training Data Collection

ROS Bags provide the raw, time-synchronized multimodal data required to develop and train perception and planning algorithms.

  • Perception Models: Record synchronized streams from cameras, LiDAR, and IMUs to train object detection, semantic segmentation, or SLAM models without needing the physical robot present.
  • Imitation Learning: Capture topic data (e.g., /cmd_vel, /joint_states) during expert demonstrations to create datasets for behavioral cloning.
  • Benchmarking: Create standard bag files with challenging scenarios (e.g., dynamic obstacles, poor lighting) to serve as a consistent benchmark for evaluating different algorithm versions.
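
Building training samples from a bag usually means pairing messages from different sensors by timestamp, since the streams arrive at different rates. The following is a minimal sketch of nearest-timestamp matching under a tolerance. The function name pair_nearest, the sample stamps, and the 10 ms tolerance are illustrative, and one lidar stamp may be matched to more than one camera frame.

```python
from bisect import bisect_left


def pair_nearest(camera_ns, lidar_ns, tolerance_ns):
    """Pair each camera stamp with the closest lidar stamp within tolerance_ns."""
    lidar_sorted = sorted(lidar_ns)
    pairs = []
    for cam in sorted(camera_ns):
        i = bisect_left(lidar_sorted, cam)
        # Candidates are the neighbours on either side of the insertion point.
        candidates = lidar_sorted[max(0, i - 1): i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda stamp: abs(stamp - cam))
        if abs(best - cam) <= tolerance_ns:
            pairs.append((cam, best))
    return pairs


camera = [0, 33_000_000, 66_000_000, 100_000_000]
lidar = [5_000_000, 105_000_000]
print(pair_nearest(camera, lidar, tolerance_ns=10_000_000))
```

Camera frames with no lidar scan inside the tolerance are dropped rather than paired with stale data.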
03

System Integration and Regression Testing

Bags enable continuous integration (CI) pipelines for robotic software by providing reproducible, hardware-independent test scenarios.

  • CI/CD Pipelines: A test node subscribes to a bag's playback and validates that new code produces the expected outputs (e.g., correct object detections, planned paths).
  • Regression Detection: By comparing the outputs of a new system version against a golden master bag recorded from a known-good version, engineers can detect subtle behavioral regressions.
  • Hardware-in-the-Loop (HIL): Bags can feed recorded or simulated sensor data to a system partially running on real control hardware, testing the integration of physical actuators with new perception software.
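
The regression check can be reduced to comparing two sets of outputs keyed by timestamp. The sketch below assumes each output is a single number (for example, an estimated obstacle distance in meters) and flags frames that are missing or differ from the golden values by more than a tolerance. The names, values, and tolerance are hypothetical; real pipelines would compare richer message types.

```python
def find_regressions(golden, candidate, tolerance):
    """Compare per-frame outputs keyed by timestamp and list the mismatches."""
    problems = []
    for stamp, expected in sorted(golden.items()):
        actual = candidate.get(stamp)
        if actual is None:
            problems.append((stamp, "missing output"))
        elif abs(actual - expected) > tolerance:
            problems.append((stamp, f"expected {expected}, got {actual}"))
    return problems


golden = {0: 1.50, 100: 1.48, 200: 1.45}  # outputs from the known-good version
candidate = {0: 1.51, 100: 1.60}          # outputs from the version under test
for stamp, reason in find_regressions(golden, candidate, tolerance=0.05):
    print(stamp, reason)
```

In a CI job, a non-empty result would fail the build.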
04

Simulation and Digital Twin Validation

Recorded real-world data is crucial for creating and validating high-fidelity simulations.

  • Simulation Ground Truth: A bag from a physical robot provides the ground truth sensor readings and robot states needed to calibrate and validate physics simulators (e.g., Gazebo, Isaac Sim), ensuring the sim's outputs match reality.
  • Digital Twin Synchronization: Play back a bag into a simulation to create a synchronized digital twin of a past real-world operation, enabling detailed forensic analysis in a risk-free virtual environment.
  • Sim-to-Real Gap Analysis: By running the same algorithm on both a bag (real data) and a simulated replica, engineers can quantify the reality gap and refine their sim-to-real transfer techniques.
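
One simple way to put a number on the gap is the root-mean-square error between a signal recorded in the bag and the same signal from the simulated run. The sketch below assumes both signals are already resampled to the same timestamps. The sample positions are made up for illustration, and RMSE is only one of several possible metrics.

```python
import math


def rmse(real, simulated):
    """Root-mean-square error between two equally sampled signals."""
    if len(real) != len(simulated) or not real:
        raise ValueError("signals must be non-empty and the same length")
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(real, simulated)) / len(real))


real_x = [0.00, 0.10, 0.21, 0.33]  # x position from the bag, in meters
sim_x = [0.00, 0.10, 0.20, 0.30]   # x position from the simulator, in meters
print(f"position RMSE: {rmse(real_x, sim_x):.4f} m")
```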
05

Documentation and Knowledge Sharing

A well-annotated ROS Bag serves as a canonical artifact that captures a specific robotic capability or test scenario.

  • Reproducible Research: In academia and industrial R&D, publishing the bag file alongside a paper allows peers to exactly reproduce experimental results and verify claims.
  • Team Handoff: A new engineer can understand system behavior by replaying key scenario bags, seeing the exact data flows that experienced developers debugged.
  • Demonstration Archives: Bags record successful complex maneuvers (e.g., a door opening, a dynamic obstacle avoidance sequence) for stakeholder reviews, without needing to stage a live demo.
06

Performance Profiling and System Characterization

Bags allow engineers to analyze the temporal behavior and resource usage of their ROS graph under realistic load.

  • Latency Analysis: Tools can process bags to measure end-to-end latency from a sensor message (e.g., /camera/image_raw) to a resulting command (e.g., /cmd_vel).
  • Network Load Assessment: By analyzing message rates and sizes across topics in a bag, engineers can characterize network bandwidth usage and identify potential bottlenecks before deployment.
  • Deterministic Replay for Profiling: Tracing and profiling tools (e.g., ros2_tracing) can be run during a bag replay to isolate CPU-intensive callbacks under a consistent, repeatable data load, unlike variable live tests.
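
Latency analysis of this kind can be approximated directly from recorded timestamps. The sketch below uses a simple heuristic: each command is attributed to the most recent sensor message recorded at or before it. That attribution is an assumption, since real causality would need tracing through the pipeline, and the function name and sample stamps are illustrative.

```python
from bisect import bisect_right
from statistics import mean


def sensor_to_command_latencies(sensor_ns, command_ns):
    """For each command, latency in ms since the latest sensor message at or before it."""
    sensor_sorted = sorted(sensor_ns)
    latencies_ms = []
    for cmd in sorted(command_ns):
        i = bisect_right(sensor_sorted, cmd)
        if i == 0:
            continue  # command was recorded before any sensor message
        latencies_ms.append((cmd - sensor_sorted[i - 1]) / 1e6)
    return latencies_ms


image_stamps = [0, 100_000_000, 200_000_000]         # e.g. /camera/image_raw
cmd_stamps = [40_000_000, 135_000_000, 260_000_000]  # e.g. /cmd_vel
latencies = sensor_to_command_latencies(image_stamps, cmd_stamps)
print(latencies, mean(latencies))
```

The same timestamps, combined with message sizes, also give per-topic rates and bandwidth.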
COMPARISON

ROS 1 vs. ROS 2 Bag Formats

A technical comparison of the core architectural differences, capabilities, and limitations between the ROS 1 and ROS 2 bag file formats for recording and playing back robotic system data.

| Feature / Metric | ROS 1 Bag Format | ROS 2 Bag Format (MCAP Default) |
| --- | --- | --- |
| Primary File Format | Custom binary (.bag) | MCAP (.mcap) or SQLite3 (.db3) |
| Underlying Architecture | Custom ROS 1 serialization & TCPROS/UDPROS transport | Pluggable storage backend; MCAP is a container format |
| Metadata & Indexing | Basic header; linear index for time | Rich, extensible metadata; efficient chunked indexing |
| Data Compression | Per-chunk (BZ2, LZ4) | Per-chunk (Zstandard, LZ4) and optional per-message |
| Concurrent Read/Write | | |
| Reliability & Corruption Recovery | Limited; corruption can invalidate entire file | Robust; MCAP's chunked design isolates corruption |
| Quality of Service (QoS) Policy Recording | | |
| Schema Evolution Support | Limited; requires careful message versioning | Native in MCAP via embedded .msg/.idl schemas |
| Performance (Write Speed) | ~80-90% of network rate | 95% of network rate with Zstd compression |
| Performance (Random Access Seek) | Moderate (linear index) | Fast (chunked index with summary section) |
| Interoperability & Tooling | Mature rosbag CLI; limited external tool support | Growing ros2 bag CLI; MCAP has broad external tool support (e.g., Foxglove) |
| Recommended Use Case | Legacy ROS 1 systems; simple recording/playback | Modern, production-grade systems; data integrity, analysis, and long-term archiving |

ROS BAG

Frequently Asked Questions

A ROS Bag is the primary file format for recording and playing back ROS topic data. This FAQ addresses common questions about its purpose, mechanics, and best practices for robotics software engineers.

A ROS Bag is a file format used to record serialized messages published on ROS topics for later playback and analysis. ROS 1 bags are single files with the .bag extension; ROS 2 bags are directories that contain a metadata.yaml file and one or more .mcap or .db3 storage files. A bag functions as a time-stamped log of a robotic system's sensor data, internal state, and command streams, enabling offline debugging, algorithm development, and system validation without requiring live hardware.

  • Core Purpose: Provides repeatable replay of sensor and system data.
  • File Format: ROS 1 uses a custom binary format; ROS 2 stores serialized ROS messages in MCAP or SQLite3 files.
  • Key Tool: The primary command-line utilities are ros2 bag record and ros2 bag play (or their ROS 1 equivalents, rosbag record and rosbag play).

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over more than five years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.