A ROS Bag is a container file format that records serialized message data published on ROS topics, enabling the logging, offline analysis, and repeatable replay of a robotic system's runtime sensor data, commands, and state. It functions as a time-indexed data recorder, capturing the temporal sequence of asynchronous communications across the ROS graph. This allows developers to debug complex interactions, train machine learning models, and perform regression testing without requiring continuous access to physical hardware.
ROS Bag

What is a ROS Bag?
A ROS Bag is the standard file format for recording and replaying message data in the Robot Operating System (ROS) ecosystem.
The format is optimized for sequential read/write operations and supports compression and chunking for efficient storage. During playback, a tool like rosbag play (ROS 1) or ros2 bag play (ROS 2) republishes the recorded messages onto their respective topics, recreating the original data flow with configurable timing. This capability is fundamental for simulation, sensor data annotation, and validating algorithms against ground-truth datasets. In ROS 2, the bag tooling (rosbag2) was redesigned around pluggable storage backends, SQLite3 (.db3) and MCAP (.mcap, the default since the Iron release), for improved performance and integration with the underlying Data Distribution Service (DDS) middleware.
Core Characteristics of ROS Bags
A ROS Bag is a file format for recording and playing back ROS topic data, enabling offline debugging, analysis, and simulation of robotic system behavior. Its core characteristics define its utility, performance, and integration within the ROS ecosystem.
Topic-Based Selective Recording
ROS Bags do not record the entire system state but instead capture data from specific ROS Topics. This allows engineers to log only the necessary sensor data (e.g., /camera/image_raw, /scan) and command streams, minimizing file size and focusing analysis.
- Selective Capture: Use the `ros2 bag record <topic_name>` command to record specific topics.
- Efficiency: Avoids logging internal debug topics, conserving storage.
- Use Case: Essential for isolating data from a malfunctioning perception node without capturing irrelevant actuator commands.
Time-Synchronized Playback
During playback, the player republishes the recorded messages onto the live ROS graph in their original order, preserving the recorded inter-message delays. Message contents, including any header timestamps, are left unchanged, so downstream nodes see a repeatable reproduction of past system behavior.
- Deterministic Replay: Enables exact reproduction of sensor streams for debugging intermittent faults.
- Simulation Input: A recorded bag file can serve as a high-fidelity sensor input for testing new perception algorithms offline.
- Tool: The `ros2 bag play` command handles this synchronization, optionally accelerating (`--rate`) or looping (`--loop`) the playback.
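The timing logic behind rate-scaled playback can be sketched in a few lines. This is a minimal illustration, not the rosbag2 implementation: `playback_offsets`, `replay`, and the `(timestamp, topic, payload)` tuple layout are hypothetical names chosen for this example, and `publish` stands in for whatever actually sends the message.

```python
import time
from typing import Callable, Iterable, List, Tuple

# Hypothetical record layout: (recorded timestamp in ns, topic name, raw payload)
Message = Tuple[int, str, bytes]


def playback_offsets(timestamps_ns: List[int], rate: float = 1.0) -> List[float]:
    """Seconds after playback start at which each message should be sent.

    The recorded spacing between messages is preserved and divided by `rate`,
    so rate=2.0 plays the recording twice as fast.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not timestamps_ns:
        return []
    start = timestamps_ns[0]
    return [(t - start) / 1e9 / rate for t in timestamps_ns]


def replay(
    messages: Iterable[Message],
    publish: Callable[[str, bytes], None],
    rate: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Republish messages in timestamp order, waiting out the scaled gaps."""
    ordered = sorted(messages, key=lambda m: m[0])
    offsets = playback_offsets([m[0] for m in ordered], rate)
    t0 = clock()
    for (_, topic, payload), offset in zip(ordered, offsets):
        remaining = offset - (clock() - t0)
        if remaining > 0:
            sleep(remaining)  # wait until this message is due
        publish(topic, payload)
```

Scheduling each message against a single start time, rather than sleeping for each individual gap, keeps small scheduling delays from accumulating over a long recording.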
Integration with ROS 2 Tools
ROS Bags are a first-class citizen within the ROS 2 tooling ecosystem, enabling a seamless workflow from recording to visualization and analysis.
- CLI Tools: `ros2 bag` is the primary command-line interface for recording, playing, and inspecting bags.
- RViz Visualization: Played-back topics can be visualized in RViz just like live data, for replaying sensor feeds and robot trajectories.
- `rqt_bag` GUI: Provides a graphical interface for inspecting message timelines, contents, and plotting data.
- Python API (`rosbag2_py`): Allows programmatic bag creation, reading, and data extraction for custom analysis pipelines.
Performance and Storage Considerations
Recording high-frequency sensor data (e.g., images, point clouds) presents significant I/O and storage challenges. ROS Bag configuration is critical for managing system load.
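- Compression: The MCAP format supports chunk-level compression (Zstandard or LZ4), and rosbag2 additionally offers file-level and message-level compression modes, reducing file size for bulky streams such as images.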
- Buffer Management: The recorder uses a configurable cache to buffer messages before writing to disk, preventing drops during I/O spikes.
- Storage Options: Can write to fast SSDs for high-bandwidth logging or network storage for long-term archiving.
- Split Files: Bags can be automatically split by size or duration to manage file handles and facilitate processing.
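A quick back-of-the-envelope estimate helps when choosing split sizes and storage hardware. The sketch below is illustrative only: `estimate_bag_bytes` and `split_count` are hypothetical helpers, the calculation counts payload bytes and ignores serialization and index overhead, and the compression ratio is an assumed input rather than a measured one.

```python
import math


def estimate_bag_bytes(rate_hz: float, message_bytes: int, duration_s: float,
                       compression_ratio: float = 1.0) -> int:
    """Approximate payload bytes written for one topic.

    compression_ratio is compressed size / raw size (1.0 means uncompressed).
    """
    return math.ceil(rate_hz * message_bytes * duration_s * compression_ratio)


def split_count(total_bytes: int, max_file_bytes: int) -> int:
    """Number of files produced if the bag is split at max_file_bytes."""
    return max(1, math.ceil(total_bytes / max_file_bytes))


# Example: one 640x480 RGB camera (3 bytes per pixel) at 30 Hz for 60 seconds.
frame_bytes = 640 * 480 * 3
total = estimate_bag_bytes(rate_hz=30, message_bytes=frame_bytes, duration_s=60)
files = split_count(total, max_file_bytes=1024 ** 3)  # split at 1 GiB
print(f"{total / 1e9:.2f} GB across {files} file(s)")
```

Even a single modest camera produces well over a gigabyte per minute uncompressed, which is why compression, cache sizing, and file splitting matter for image and point-cloud topics.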
Primary Use Cases in Development
ROS Bags are indispensable throughout the robotic software lifecycle, serving specific roles in development, testing, and deployment.
- Offline Debugging: Record a problematic run, then repeatedly analyze sensor data and internal state to isolate bugs.
- Algorithm Regression Testing: Use a standard set of "golden" bag files as test fixtures to validate that perception or planning changes do not degrade performance.
- Documentation & Sharing: Bags provide a reproducible dataset for sharing with team members or for publication, ensuring everyone analyzes the exact same sensor inputs.
- Simulation Bridging: Record data from a physical robot to create a realistic simulation scenario, or play simulated data into a real robot's software stack for Hardware-in-the-Loop (HIL) testing.
How ROS Bags Work: Recording and Playback
A ROS Bag is a specialized file format for recording serialized message data published on ROS topics. The recorder acts as a passive subscriber, capturing a timestamped log of all communication on the specified topics without altering the messages exchanged by the live system. The recorded bag can later be replayed, publishing the stored messages in chronological order to simulate the original data flow for offline testing, algorithm development, and post-mortem analysis of robotic experiments.
Playback is managed by the rosbag2 utilities in ROS 2, which allow flexible control over the publishing rate, the topics to replay, and the starting point within the recording. Storage is handled by a pluggable backend: SQLite3 (.db3) in earlier releases and MCAP (.mcap) by default since Iron, both of which index messages by time and topic. This capability is fundamental for debugging perception pipelines, validating state estimation algorithms, and creating reproducible datasets for training machine learning models without requiring constant access to physical hardware or sensor streams.
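For bags stored with the SQLite3 backend, the .db3 file can be inspected with nothing more than Python's standard `sqlite3` module. The sketch below assumes the table layout commonly written by the rosbag2 SQLite3 storage plugin, a `topics` table with `id` and `name` columns and a `messages` table with `topic_id` and `timestamp` (nanoseconds) columns; verify this against your own file before relying on it. The function names are hypothetical, and the message payloads remain serialized, so decoding them still requires the ROS message definitions.

```python
import sqlite3
from typing import Dict, Optional, Tuple


def open_bag_readonly(db3_path: str) -> sqlite3.Connection:
    """Open a .db3 storage file without risking modification."""
    return sqlite3.connect(f"file:{db3_path}?mode=ro", uri=True)


def count_messages_per_topic(conn: sqlite3.Connection) -> Dict[str, int]:
    """Map each recorded topic name to its message count (assumed schema)."""
    rows = conn.execute(
        """
        SELECT topics.name, COUNT(messages.id)
        FROM topics
        LEFT JOIN messages ON messages.topic_id = topics.id
        GROUP BY topics.id, topics.name
        ORDER BY topics.name
        """
    ).fetchall()
    return {name: count for name, count in rows}


def recording_span_ns(conn: sqlite3.Connection) -> Optional[Tuple[int, int]]:
    """Return (first, last) message timestamp in ns, or None for an empty bag."""
    first, last = conn.execute(
        "SELECT MIN(timestamp), MAX(timestamp) FROM messages"
    ).fetchone()
    return None if first is None else (first, last)
```

A typical use would be `conn = open_bag_readonly("my_bag/my_bag_0.db3")` followed by `count_messages_per_topic(conn)`, where the path is a placeholder for an actual storage file inside a bag directory.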
Primary Use Cases for ROS Bags
ROS Bags support every stage of the robotics development and deployment lifecycle, from debugging and data collection to testing and profiling.
Offline Debugging and Post-Mortem Analysis
This is the most fundamental use case. Engineers record sensor data and internal system state during field tests or lab experiments. The bag file acts as a deterministic log of the system's execution, allowing them to:
- Replay the exact sensor inputs and messages to reproduce bugs.
- Inspect the timing and content of recorded messages with `rqt_bag`, and summarize topics, message counts, and duration with `ros2 bag info`.
- Isolate failures by correlating anomalous robot behavior with specific sensor readings or internal state changes that occurred seconds or minutes prior.
Algorithm Development and Training Data Collection
ROS Bags provide the raw, time-synchronized multimodal data required to develop and train perception and planning algorithms.
- Perception Models: Record synchronized streams from cameras, LiDAR, and IMUs to train object detection, semantic segmentation, or SLAM models without needing the physical robot present.
- Imitation Learning: Capture topic data (e.g., `/cmd_vel`, `/joint_states`) during expert demonstrations to create datasets for behavioral cloning.
- Benchmarking: Create standard bag files with challenging scenarios (e.g., dynamic obstacles, poor lighting) to serve as a consistent benchmark for evaluating different algorithm versions.
System Integration and Regression Testing
Bags enable continuous integration (CI) pipelines for robotic software by providing reproducible, hardware-independent test scenarios.
- CI/CD Pipelines: A test node subscribes to a bag's playback and validates that new code produces the expected outputs (e.g., correct object detections, planned paths).
- Regression Detection: By comparing the outputs of a new system version against a golden master bag recorded from a known-good version, engineers can detect subtle behavioral regressions.
- Hardware-in-the-Loop (HIL): Bags can feed simulated sensor data to a system partially running on real control hardware, testing the integration of physical actuators with new perception software.
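The golden-master comparison can be reduced to a small check that a CI job runs after replaying a bag through the system under test. The sketch below is one possible shape for that check, assuming each run has already been reduced to an equal-length list of scalar outputs (for example, a detection confidence per frame); `find_regressions` and the tolerance value are hypothetical.

```python
from typing import List, Sequence


def find_regressions(golden: Sequence[float], candidate: Sequence[float],
                     tolerance: float) -> List[int]:
    """Indices where the candidate output deviates from the golden output.

    A deviation is any absolute difference strictly greater than `tolerance`.
    """
    if len(golden) != len(candidate):
        raise ValueError("golden and candidate outputs differ in length")
    return [
        i
        for i, (expected, actual) in enumerate(zip(golden, candidate))
        if abs(expected - actual) > tolerance
    ]


# Example: the third output drifted by 0.2, beyond the 0.1 tolerance.
golden_outputs = [0.5, 0.6, 0.7]
new_outputs = [0.5, 0.65, 0.9]
print(find_regressions(golden_outputs, new_outputs, tolerance=0.1))
```

A tolerance is used instead of exact equality because replayed runs rarely match bit for bit: thread scheduling and floating-point ordering introduce small variations even with identical input data.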
Simulation and Digital Twin Validation
Recorded real-world data is crucial for creating and validating high-fidelity simulations.
- Simulation Ground Truth: A bag from a physical robot provides the ground truth sensor readings and robot states needed to calibrate and validate physics simulators (e.g., Gazebo, Isaac Sim), ensuring the sim's outputs match reality.
- Digital Twin Synchronization: Play back a bag into a simulation to create a synchronized digital twin of a past real-world operation, enabling detailed forensic analysis in a risk-free virtual environment.
- Sim-to-Real Gap Analysis: By running the same algorithm on both a bag (real data) and a simulated replica, engineers can quantify the reality gap and refine their sim-to-real transfer techniques.
Documentation and Knowledge Sharing
A well-annotated ROS Bag serves as a canonical artifact that captures a specific robotic capability or test scenario.
- Reproducible Research: In academia and industrial R&D, publishing the bag file alongside a paper allows peers to exactly reproduce experimental results and verify claims.
- Team Handoff: A new engineer can understand system behavior by replaying key scenario bags, seeing the exact data flows that experienced developers debugged.
- Demonstration Archives: Bags record successful complex maneuvers (e.g., a door opening, a dynamic obstacle avoidance sequence) for stakeholder reviews, without needing to stage a live demo.
Performance Profiling and System Characterization
Bags allow engineers to analyze the temporal behavior and resource usage of their ROS graph under realistic load.
- Latency Analysis: Tools can process bags to measure end-to-end latency from a sensor message (e.g., `/camera/image_raw`) to a resulting command (e.g., `/cmd_vel`).
- Network Load Assessment: By analyzing message rates and sizes across topics in a bag, engineers can characterize network bandwidth usage and identify potential bottlenecks before deployment.
- Deterministic Replay for Profiling: Tracing and profiling tools (e.g., `ros2_tracing`) can be run during a bag replay to isolate CPU-intensive callbacks under a consistent, repeatable data load, unlike variable live tests.
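A rough latency estimate can be derived from recorded timestamps alone. The sketch below pairs each sensor message with the first command recorded after it; this pairing is a heuristic assumed for illustration, since a bag does not record which input caused which output, and `sensor_to_command_latencies_ms` is a hypothetical helper name.

```python
from bisect import bisect_right
from statistics import mean
from typing import List, Sequence


def sensor_to_command_latencies_ms(sensor_stamps_ns: Sequence[int],
                                   command_stamps_ns: Sequence[int]) -> List[float]:
    """Latency in ms from each sensor message to the next command after it.

    Sensor messages with no later command are skipped.
    """
    commands = sorted(command_stamps_ns)
    latencies = []
    for stamp in sorted(sensor_stamps_ns):
        i = bisect_right(commands, stamp)  # first command strictly after stamp
        if i < len(commands):
            latencies.append((commands[i] - stamp) / 1e6)
    return latencies


# Example: images at 0, 100, and 200 ms; commands at 30 and 140 ms.
image_stamps = [0, 100_000_000, 200_000_000]
cmd_stamps = [30_000_000, 140_000_000]
latencies = sensor_to_command_latencies_ms(image_stamps, cmd_stamps)
print(latencies, mean(latencies))
```

Because the timestamps in a bag usually reflect when the recorder received each message, the result includes recording overhead; tracing tools give a more precise picture when exact callback timing matters.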
ROS 1 vs. ROS 2 Bag Formats
A technical comparison of the core architectural differences, capabilities, and limitations between the ROS 1 and ROS 2 bag file formats for recording and playing back robotic system data.
| Feature / Metric | ROS 1 Bag Format | ROS 2 Bag Format (MCAP Default) |
|---|---|---|
| Primary File Format | Custom binary (.bag) | MCAP (.mcap) or SQLite3 (.db3) |
| Underlying Architecture | Custom ROS 1 serialization & TCPROS/UDPROS transport | Pluggable storage backend; MCAP is a container format |
| Metadata & Indexing | Basic header; linear index for time | Rich, extensible metadata; efficient chunked indexing |
| Data Compression | Per-chunk (BZ2, LZ4) | Per-chunk (Zstandard, LZ4) and optional per-message |
| Concurrent Read/Write | | |
| Reliability & Corruption Recovery | Limited; corruption can invalidate entire file | Robust; MCAP's chunked design isolates corruption |
| Quality of Service (QoS) Policy Recording | No (ROS 1 has no QoS policies) | Yes (offered QoS profiles are stored with each topic) |
| Schema Evolution Support | Limited; requires careful message versioning | Native in MCAP via embedded .msg/.idl schemas |
| Performance (Write Speed) | ~80-90% of network rate | |
| Performance (Random Access Seek) | Moderate (linear index) | Fast (chunked index with summary section) |
| Interoperability & Tooling | Mature | Growing |
| Recommended Use Case | Legacy ROS 1 systems; simple recording/playback | Modern, production-grade systems; data integrity, analysis, and long-term archiving |
Frequently Asked Questions
A ROS Bag is the primary file format for recording and playing back ROS topic data. This FAQ addresses common questions about its purpose, mechanics, and best practices for robotics software engineers.
A ROS Bag is a file format used to record serialized messages published on ROS Topics for later playback and analysis. In ROS 1, a bag is a single file with the .bag extension; in ROS 2, a bag is a directory containing a metadata.yaml file and one or more .mcap or .db3 storage files. It functions as a time-stamped log of a robotic system's sensor data, internal state, and command streams, enabling offline debugging, algorithm development, and system validation without requiring live hardware.
- Core Purpose: Provides repeatable replay of sensor and system data.
- File Format: ROS 1 uses a custom binary format; ROS 2 uses a pluggable storage backend (MCAP or SQLite3). Both store serialized ROS Messages.
- Key Tool: The primary command-line utilities are `ros2 bag record` and `ros2 bag play` (or their ROS 1 equivalents, `rosbag record` and `rosbag play`).
Related Terms
ROS Bag files are a core data logging tool, but they exist within a larger ecosystem of communication patterns and data structures that define how a robotic system operates. These related concepts are essential for understanding the context and purpose of ROS Bags.
ROS Topic
A ROS Topic is a named, many-to-many communication channel for asynchronous data streaming. Nodes publish messages to a topic, and any number of subscriber nodes can receive them. This is the primary data source recorded by a ROS Bag.
- Publish-Subscribe Model: Decouples data producers from consumers.
- Message Type: Every topic has a strictly defined `.msg` type (e.g., `sensor_msgs/Image`, `geometry_msgs/Twist`).
- Bag Contents: A ROS Bag is essentially a serialized log of messages published on one or more topics, along with their precise timestamps.
ROS Message (.msg)
A ROS Message is a strictly typed data structure that defines the format of information exchanged between nodes. Defined in .msg files, they are the atomic unit of data serialized within a ROS Bag.
- Type Safety: Ensures publishers and subscribers agree on data structure.
- Composition: Messages can contain primitive types (integers, strings) or other nested messages.
- Serialization: When a message is published, it is converted (serialized) into a binary format for transmission and storage in the bag file. Tools like `rosbag` deserialize this data for playback and analysis.
ROS 2 Quality of Service (QoS)
ROS 2 Quality of Service (QoS) policies govern the reliability, durability, and timeliness of communication between nodes. These policies critically affect what data is available to be recorded in a ROS 2 Bag.
- Reliability: `RELIABLE` vs `BEST_EFFORT`. A bag recording with `BEST_EFFORT` subscribers may miss messages under system load.
- Durability: `VOLATILE` vs `TRANSIENT_LOCAL`. A `TRANSIENT_LOCAL` publisher retains messages for late-joining subscribers, so a recording node that also requests `TRANSIENT_LOCAL` durability does not miss initial data.
- History Depth: Determines how many messages are queued before being dropped, impacting recording completeness during processing spikes.
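The effect of history depth on recording completeness can be illustrated with a toy queue model. This is a deliberately simplified sketch, not a model of any DDS implementation: it assumes one message arrives per tick, the recorder consumes one message every `drain_every` ticks, and a full queue discards its oldest entry (keep-last behavior). The function name and parameters are hypothetical.

```python
from collections import deque
from typing import List, Tuple


def simulate_keep_last(num_messages: int, depth: int,
                       drain_every: int) -> Tuple[List[int], List[int]]:
    """Return (received, dropped) sequence numbers for a slow consumer."""
    if depth < 1 or drain_every < 1:
        raise ValueError("depth and drain_every must be at least 1")
    queue = deque(maxlen=depth)
    received: List[int] = []
    dropped: List[int] = []
    for seq in range(num_messages):
        if len(queue) == depth:
            dropped.append(queue[0])  # oldest entry is evicted by the append
        queue.append(seq)
        if (seq + 1) % drain_every == 0:
            received.append(queue.popleft())  # consumer takes the oldest message
    received.extend(queue)  # whatever is still queued is read at the end
    return received, dropped


# Example: depth 2, consumer three times slower than the publisher.
print(simulate_keep_last(num_messages=6, depth=2, drain_every=3))
# Same load with a deeper queue: nothing is lost.
print(simulate_keep_last(num_messages=6, depth=6, drain_every=3))
```

In this model a queue at least as deep as the longest burst absorbs the backlog, which is the intuition behind raising history depth or the recorder's cache size when messages go missing from a bag.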
ROS Graph
The ROS Graph is the runtime network of all ROS nodes, topics, services, and actions in a system. A ROS Bag provides a temporal snapshot of the data flowing through a subset of this graph.
- Dynamic Topology: The graph can change as nodes start and stop.
- Visualization: Tools like `rqt_graph` visualize the live graph.
- Bag as a Record: Playing back a bag recreates a segment of the graph's data flow in time, allowing developers to analyze the system's state and interactions offline. It is a recorded slice of the graph's activity.
rosbag2 & rviz2
rosbag2 is the ROS 2 suite of tools for bag recording and playback, and rviz2 is the 3D visualization tool. They are the primary applications for interacting with bag data.
- rosbag2 CLI: Commands like `ros2 bag record -o my_bag /camera/image /lidar/scan` start recording. `ros2 bag play my_bag` replays data.
- Storage Formats: ROS 2 bags used an SQLite database format (`.db3`) by default in earlier releases and use MCAP (`.mcap`) by default since Iron; both enable efficient indexing by time and topic.
- Visualization: During bag playback, rviz2 can subscribe to re-published topics (e.g., images, point clouds, TF transforms) to visually debug and analyze the recorded sensor data and robot state.
Simulation & Gazebo
High-fidelity physics simulators like Gazebo are used to generate synthetic ROS Bag data in a controlled, repeatable virtual environment before testing on physical hardware.
- Data Generation: Simulated sensors (cameras, IMUs, LiDAR) publish data to ROS topics just like real hardware, which can be recorded into bags.
- Ground Truth: Simulation provides perfect ground truth data (e.g., exact robot pose, object locations) that can be recorded alongside sensor topics for algorithm training and validation.
- Regression Testing: Bags recorded from simulation provide deterministic datasets for continuous integration (CI) testing of perception and planning stacks.

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.