Inferensys

Apache Iceberg

Apache Iceberg is an open-source table format for managing large, slowly-changing datasets in data lakes, providing ACID transactions, hidden partitioning, and schema evolution.
OPEN TABLE FORMAT

What is Apache Iceberg?

Apache Iceberg is a high-performance, open-source table format designed for managing massive analytic datasets in data lakes, providing data warehouse-like reliability on scalable object storage.

Iceberg addresses the reliability and performance limitations of raw object storage. It functions as a specification layer that organizes files into tables with consistent metadata, so engines such as Apache Spark, Trino, and Flink can work with the data as if it were in a traditional warehouse.

Its architecture separates the physical data layout from the logical table view, which is what makes its main features possible. Time travel allows querying historical snapshots, partition evolution lets you change partition schemes without rewriting existing data, and snapshot isolation means readers always see a committed snapshot while writers work. This makes Iceberg a foundational component of the modern data lakehouse architecture, bridging data lakes and warehouses.

TABLE FORMAT ARCHITECTURE

Key Features of Apache Iceberg

Apache Iceberg is an open-source table format that brings data warehouse-like reliability and performance to data lakes. Its core features address the fundamental limitations of managing large-scale analytical data on object storage.

01

ACID Transactions

Apache Iceberg provides ACID (Atomicity, Consistency, Isolation, Durability) transaction guarantees on object storage. This ensures data integrity by making concurrent writes safe and preventing readers from seeing partial, uncommitted data.

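  • Atomic commits: Changes to data files and metadata are committed as a single atomic operation, by swapping the table's pointer to a new metadata file.
  • Serializable isolation: Writers use optimistic concurrency. A commit based on a stale table state is rejected and must be retried, so concurrent writers cannot produce conflicting table states.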
  • Consistent reads: Readers always see a complete, consistent snapshot of the table, even during writes.
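The commit behavior described above can be sketched as a compare-and-swap on a version pointer. This is a toy model, not Iceberg's API: `ToyCatalog`, `CommitConflict`, and `append_with_retry` are hypothetical names, and a real catalog performs the swap on a metadata file location rather than on an in-memory counter.

```python
import threading
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when the table changed after the writer read its base version."""


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a catalog holding one table's current state."""
    current_version: int = 0
    files: tuple = ()
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def load(self):
        # Readers get an immutable view: a version number plus a tuple of files.
        with self._lock:
            return self.current_version, self.files

    def commit(self, base_version, new_files):
        # Compare-and-swap: succeed only if nobody committed since base_version.
        with self._lock:
            if base_version != self.current_version:
                raise CommitConflict(
                    f"base {base_version} != current {self.current_version}"
                )
            self.files = self.files + tuple(new_files)
            self.current_version += 1
            return self.current_version


def append_with_retry(catalog, new_files, max_attempts=3):
    """Re-read the table and retry when another writer committed first."""
    for _ in range(max_attempts):
        base_version, _ = catalog.load()
        try:
            return catalog.commit(base_version, new_files)
        except CommitConflict:
            continue
    raise CommitConflict("gave up after retries")
```

A writer that loses the race never leaves partial state behind: its commit is rejected as a whole, and a reader holding the older tuple of files keeps seeing exactly that snapshot.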
02

Hidden Partitioning & Schema Evolution

Iceberg decouples the physical layout of data from its logical representation, enabling powerful evolution capabilities without breaking queries.

  • Hidden partitioning: Queries filter on table data (e.g., WHERE event_date = '2024-01-01'), not directory paths. The table's partitioning scheme can be changed without requiring SQL queries to be rewritten.
  • Safe schema evolution: Columns can be added, dropped, renamed, or have their types widened (e.g., INT to BIGINT) in a backward-compatible way. Columns are tracked by unique ID rather than by name or position, so existing data files remain valid and readers using older schemas continue to work.
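A minimal sketch of why renaming or adding a column does not require rewriting data, assuming rows are stored keyed by a stable field ID. The `Field` class, `read_row` helper, and the sample schemas are hypothetical and only illustrate the idea of ID-based column tracking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    field_id: int   # stable identity, never reused
    name: str       # display name, free to change


def read_row(stored, schema):
    """Project a stored row (keyed by field ID) through a schema.

    Columns that the file predates come back as None.
    """
    return {f.name: stored.get(f.field_id) for f in schema}


# A row written under the original schema, keyed by field ID rather than name.
old_file_row = {1: 42, 2: "2024-01-01"}

schema_v1 = [Field(1, "user_id"), Field(2, "event_date")]
# Evolved schema: field 2 is renamed and field 3 is added. The stored row is untouched.
schema_v2 = [Field(1, "user_id"), Field(2, "event_day"), Field(3, "country")]
```

Reading `old_file_row` through `schema_v2` yields the renamed column with its original value and `None` for the column the file predates.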
03

Time Travel & Rollback

Iceberg keeps a history of table snapshots until they are expired, enabling deterministic data auditing and recovery.

  • Time travel: Query the table's state as it existed at a specific point in time or snapshot ID. In Spark SQL, for example: SELECT * FROM table VERSION AS OF 1234 or ... TIMESTAMP AS OF '2024-01-01 10:00:00'.
  • Rollback: Instantly revert the entire table to a previous, known-good state. This is critical for correcting erroneous batch jobs or recovering from data corruption.
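Time travel and rollback can both be viewed as lookups in a snapshot log. The sketch below is a toy model under that assumption; `ToyTable`, `Snapshot`, and their methods are hypothetical names and do not mirror any Iceberg client API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    files: tuple


@dataclass
class ToyTable:
    snapshots: list = field(default_factory=list)
    current_id: Optional[int] = None

    def commit(self, snapshot_id, timestamp_ms, files):
        self.snapshots.append(Snapshot(snapshot_id, timestamp_ms, tuple(files)))
        self.current_id = snapshot_id

    def by_id(self, snapshot_id):
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap
        raise KeyError(f"unknown snapshot {snapshot_id}")

    def as_of(self, timestamp_ms):
        # Latest snapshot committed at or before the requested time.
        candidates = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
        if not candidates:
            raise ValueError("no snapshot at or before that timestamp")
        return max(candidates, key=lambda s: s.timestamp_ms)

    def rollback_to(self, snapshot_id):
        # Only the pointer moves; later snapshots stay in the log.
        self.by_id(snapshot_id)
        self.current_id = snapshot_id

    def current(self):
        return self.by_id(self.current_id)
```

Because rollback only moves the current pointer, it is cheap regardless of table size, and the bad snapshot remains available for inspection.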
04

Performance Optimizations

The format includes several architectural features designed for high-performance analytics on petabyte-scale datasets.

  • Advanced metadata: Iceberg maintains rich metadata files (manifest lists and manifests) that catalog every data file, its partition values, and column-level statistics (min/max, null counts).
  • Metadata pruning: Query engines use this metadata to skip entire files and partitions that cannot contain relevant data, drastically reducing I/O.
  • Data file pruning: Within the files that remain, the row-group or stripe statistics stored by Parquet and ORC let readers skip further data.
  • Partition evolution: Partition schemes can be updated to optimize for new query patterns without a costly full data rewrite.
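Metadata pruning with min/max statistics comes down to an interval-overlap test per file. The following is a minimal sketch for a single integer column; `DataFileStats`, `prune`, and the file names are illustrative assumptions, not Iceberg structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFileStats:
    path: str
    lower: int   # minimum value of the filtered column in this file
    upper: int   # maximum value of the filtered column in this file


def prune(files, lo, hi):
    """Keep only files whose [lower, upper] range overlaps the query range [lo, hi]."""
    return [f.path for f in files if f.upper >= lo and f.lower <= hi]


# Hypothetical manifest entries for three data files.
manifest = [
    DataFileStats("f1.parquet", 1, 100),
    DataFileStats("f2.parquet", 101, 200),
    DataFileStats("f3.parquet", 150, 300),
]
```

For a filter such as `120 <= x <= 160`, `prune(manifest, 120, 160)` keeps only the second and third files, so the first is never opened.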
05

Format & Engine Agnosticism

Iceberg is designed as an open standard, independent of specific execution engines or underlying file formats.

  • Multiple engine support: Tables can be created and queried by Apache Spark, Trino, Flink, Apache Hive, and many other compute engines.
  • File format flexibility: Data files are typically stored in a columnar format such as Apache Parquet, with ORC and Avro also supported.
  • Object store native: It is optimized for cloud object stores (S3, ADLS, GCS) but works on HDFS and other systems.
06

Data Lakehouse Foundation

Iceberg is a foundational component of the data lakehouse architecture, merging the best aspects of data lakes and warehouses.

  • Combines strengths: It provides the low-cost, flexible storage of a data lake with the ACID compliance, schema enforcement, and performance of a data warehouse.
  • Unified tier: Serves as a single, reliable source of truth for both batch and streaming data, supporting BI, SQL analytics, and machine learning workloads.
  • Governance-ready: Its immutable snapshot log and rich metadata provide a strong foundation for data lineage, auditability, and governance.
TABLE FORMAT

How Apache Iceberg Works

Apache Iceberg is an open-source table format that structures massive datasets stored in object storage like Amazon S3 or Azure Data Lake Storage, providing a reliable, high-performance abstraction layer for analytical engines.

Apache Iceberg functions as a metadata layer that sits atop files in a data lake, defining tables through a table metadata file, manifest lists, manifest files, and data files. This architecture enables ACID transactions and time travel by tracking snapshots of the table's state. Operations like inserts or deletes create new snapshots that reuse unchanged data files rather than rewriting the whole table, ensuring isolation and consistency for concurrent readers and writers. The format's core innovation is decoupling physical data layout from logical query planning.

It provides hidden partitioning and schema evolution, allowing engines to filter data efficiently without directory-based partition discovery and to safely add, rename, or delete columns. Metadata pruning and statistics (like min/max values) at the file level enable fast query planning. Iceberg's design directly addresses the limitations of raw object storage, transforming it into a managed, query-optimized data lakehouse foundation compatible with engines like Spark, Trino, and Flink.

OPEN TABLE FORMAT COMPARISON

Apache Iceberg vs. Delta Lake vs. Hudi

A technical comparison of the three leading open-source table formats designed to bring data warehouse-like reliability and performance to data lakes.

| Feature / Capability | Apache Iceberg | Delta Lake | Apache Hudi |
|---|---|---|---|
| Primary Maintainer / Origin | Apache Software Foundation (originated at Netflix) | Linux Foundation (originated at Databricks) | Apache Software Foundation (originated at Uber) |
| Core Storage Abstraction | Table format with separate metadata, data, and manifest files. | Transaction log (JSON/Parquet) stored alongside data files. | Timeline of actions stored in .hoodie directory with data files. |
| ACID Transaction Guarantees | | | |
| Hidden Partitioning | | | |
| Schema Evolution | Add, drop, rename, update, reorder columns. | Add, drop, rename columns (update type with limitations). | Add, drop, rename columns. |
| Time Travel / Data Versioning | Snapshot-based via manifest lists. Supports branch/tag. | Versioned via transaction log. Direct timestamp/version query. | Snapshot-based via commit timeline. Incremental query support. |
| Partition Evolution | | | |
| Data File Format Agnostic | Primarily Parquet, Avro, ORC. | Primarily Parquet. | Primarily Parquet, Avro. |
| Primary Use Case Focus | Large-scale analytic tables with complex schemas and queries. | Reliable data engineering pipelines and streaming/batch unification. | Fast upserts/change data capture and incremental processing. |
| Streaming & Batch Unification | | | |
| Compute Engine Integration | Apache Spark, Trino, Flink, Presto, Hive, Dremio, Snowflake, etc. | Apache Spark, Databricks Runtime, Flink, Presto, Trino, etc. | Apache Spark, Flink, Hive, Presto, Trino, etc. |
| Performance Optimizations | Advanced planning via manifest files, partition pruning, column stats. | Data skipping via statistics in transaction log, Z-Ordering. | Indexing for upserts (Bloom, HBase, Simple), clustering. |

ENTERPRISE DATA ARCHITECTURE

Common Use Cases for Apache Iceberg

Apache Iceberg is a high-performance table format for managing massive analytic datasets in data lakes. Its core features—ACID transactions, hidden partitioning, and schema evolution—solve critical reliability and performance problems inherent to raw object storage.

02

Schema Evolution & Safe Migration

Iceberg supports in-place, non-breaking schema evolution, allowing table schemas to be updated without rewriting data or breaking existing queries. Key operations include:

  • Adding columns: New columns can be added and populated without affecting existing reads.
  • Renaming columns: Columns can be renamed while preserving existing data; Iceberg manages the mapping.
  • Evolving types: Certain type changes (e.g., int to long) are supported safely.
  • Nested field evolution: Adding, removing, or renaming fields within complex structs, maps, and arrays.

This eliminates costly, error-prone data migration pipelines and enables agile data product development.
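The "certain type changes" rule can be sketched as a small validity check. The allowed set below (int to long, float to double, and widening a decimal's precision at the same scale) reflects the widening promotions documented for format versions 1 and 2 of the Iceberg spec as far as I am aware; the `can_promote` function and its string-based type names are hypothetical.

```python
import re

_DECIMAL = re.compile(r"decimal\((\d+),\s*(\d+)\)")

# Widening promotions between primitive types.
_WIDENING = {("int", "long"), ("float", "double")}


def can_promote(old, new):
    """Return True if a column of type `old` can safely become type `new`."""
    if old == new or (old, new) in _WIDENING:
        return True
    m_old, m_new = _DECIMAL.fullmatch(old), _DECIMAL.fullmatch(new)
    if m_old and m_new:
        p_old, s_old = (int(g) for g in m_old.groups())
        p_new, s_new = (int(g) for g in m_new.groups())
        # Precision may grow, but the scale must stay the same.
        return s_old == s_new and p_new >= p_old
    return False
```

Narrowing changes such as long to int are rejected because values already stored in existing files might not fit the new type.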
03

Hidden Partitioning & Partition Evolution

Iceberg implements hidden partitioning, where the physical layout is decoupled from the logical table schema. This solves major pain points of Hive-style partitioning:

  • No directory-based filters: Users query by column (e.g., WHERE event_date = '2024-01-01'), not by path. Iceberg automatically applies partition transforms.
  • Partition evolution: The partition scheme of a table can be changed (e.g., from DAY(event_ts) to MONTH(event_ts)) without requiring existing data to be rewritten. New data uses the new scheme while old data remains queryable.
  • Multiple partition transforms: Supports identity, bucket, truncate, year, month, day, and hour transforms on columns.
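A few of the transforms listed above can be sketched directly, assuming timezone-aware UTC timestamps. The function names are illustrative; the bucket transform is omitted because it depends on a specific hash function that this sketch does not reproduce.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def day(ts):
    """Days since 1970-01-01 (UTC)."""
    return (ts - EPOCH).days


def month(ts):
    """Months since 1970-01."""
    return (ts.year - 1970) * 12 + (ts.month - 1)


def truncate(value, width):
    """Truncate an int down to a multiple of width, or a string to its first width characters."""
    if isinstance(value, str):
        return value[:width]
    return value - (value % width)


# Two events on the same day land in the same partition without the
# query ever naming a partition column or a directory path.
morning = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
evening = datetime(2024, 1, 1, 23, 59, tzinfo=timezone.utc)
same_partition = day(morning) == day(evening)
```

Because the partition value is derived from the source column, an engine can turn a filter on `event_ts` into a filter on `day(event_ts)` on its own, which is what keeps the partitioning hidden from query authors.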
04

Incremental Processing & Change Data Capture (CDC)

Iceberg's snapshot model enables efficient incremental processing. By tracking snapshots, systems can identify precisely what data has changed between two points in time.

  • Incremental reads: In Spark, the start-snapshot-id and end-snapshot-id read options scan only the data appended between two snapshots, and the create_changelog_view procedure exposes row-level changes.
  • Downstream pipeline optimization: Downstream ETL, materialized views, or feature stores only process new data, reducing compute costs and latency.
  • MERGE INTO for CDC: Apply updates from operational databases with MERGE INTO statements, which upsert by combining new data with the existing table. Tables can be configured for copy-on-write or merge-on-read, trading write cost against read cost.
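Identifying what changed between two snapshots can be sketched as a walk up the snapshot parent chain. This toy version handles append-only history, with the start snapshot exclusive and the end snapshot inclusive; the `incremental_appends` function and the dictionary layout are assumptions made for illustration.

```python
def incremental_appends(snapshots, start_id, end_id):
    """Files added after start_id (exclusive) up to end_id (inclusive).

    `snapshots` maps snapshot ID -> {"parent": ID or None, "added": [paths]}.
    """
    added = []
    current = end_id
    while current is not None and current != start_id:
        snap = snapshots[current]
        added = snap["added"] + added   # prepend to keep commit order
        current = snap["parent"]
    if current is None:
        raise ValueError("start_id is not an ancestor of end_id")
    return added


# Hypothetical three-snapshot history.
history = {
    1: {"parent": None, "added": ["a.parquet"]},
    2: {"parent": 1, "added": ["b.parquet"]},
    3: {"parent": 2, "added": ["c.parquet", "d.parquet"]},
}
```

A downstream job that last processed snapshot 1 would call `incremental_appends(history, 1, 3)` and read only the three newer files.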
05

Performance Optimization with Data Skipping

Iceberg maintains rich metadata—including manifest files with column-level statistics (min/max values, null counts)—enabling highly efficient data skipping during query planning.

  • Metadata filtering: Query engines (Spark, Trino, Flink) read the metadata first to prune files that cannot contain relevant data, drastically reducing I/O.
  • Compaction: Small files created by streaming writes can be compacted into larger files by running the rewrite_data_files procedure, typically on a schedule, to maintain read performance.
  • Sorting and Z-ordering: When files are rewritten, data can be sorted or Z-ordered on multiple columns (e.g., user_id, event_date), co-locating related data and making min/max data skipping more effective.
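The idea behind Z-ordering is to interleave the bits of several column values into one sort key, so rows that are close in every column end up close in the file. Below is a minimal sketch for two non-negative integers; `z_value` is a hypothetical helper, and a production implementation must also handle other types, signed values, and more than two columns.

```python
def z_value(x, y, bits=16):
    """Interleave the low `bits` bits of two non-negative ints into one Z-order key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # x occupies even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)    # y occupies odd bit positions
    return z


# Sorting rows by the interleaved key clusters them on both columns at once.
rows = [(0, 0), (1, 1), (3, 3), (0, 1), (2, 2)]
clustered = sorted(rows, key=lambda r: z_value(*r))
```

After such a sort, each output file tends to cover a narrow range of both columns, so the per-file min/max statistics exclude more files for filters on either column.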
06

Multi-Modal Data Lake Foundation

Iceberg serves as a unified table layer for heterogeneous data workloads, making it a cornerstone for data lakehouse architectures.

  • Unified Batch & Streaming: Serves as both the source and sink for batch (Spark) and streaming (Flink, Kafka Connect) jobs with full ACID guarantees.
  • Multi-engine consistency: Tables can be written by Spark and then queried by Trino, Presto, or Snowflake through the same catalog without inconsistency, thanks to a standardized, open metadata format.
  • Foundation for ML/Feature Stores: Provides reliable, versioned, and efficiently queryable storage for feature data, acting as a robust backend for feature stores. Its time-travel capability is essential for point-in-time correctness in model training.
APACHE ICEBERG

Frequently Asked Questions

Apache Iceberg is a foundational technology for modern data lakehouses. These questions address its core mechanics, benefits, and how it compares to related technologies.

Apache Iceberg is an open-source, high-performance table format for managing massive analytic tables on scalable object storage like Amazon S3 or Azure Blob Storage. It works by adding a structured metadata layer on top of raw data files (e.g., Parquet, ORC) that tracks the table's complete state, enabling features like ACID transactions, time travel, and schema evolution without locking data. The architecture consists of:

  • Metadata Files: JSON files that record the table's schema, partition spec, and snapshot history; the catalog points to the current one.
  • Manifest Lists: Files that list manifests for a given snapshot.
  • Manifest Files: Files that list data files with partition and column-level statistics.
  • Data Files: The actual Parquet/ORC/Avro files containing the table's data.

When a query runs, the engine reads the metadata to precisely identify which data files are relevant, enabling efficient partition pruning and file skipping even for complex queries.
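The four layers can be sketched as nested lookups. The structure below is loosely modeled on the layout described above, with plain dictionaries standing in for the real JSON and Avro files; all file names and the `plan_files` function are made up for illustration.

```python
# Toy metadata tree: every name and value here is made up for illustration.
table_metadata = {
    "current-snapshot-id": 2,
    "snapshots": {
        1: {"manifest-list": "snap-1.avro"},
        2: {"manifest-list": "snap-2.avro"},
    },
}
manifest_lists = {
    "snap-1.avro": ["m1.avro"],
    "snap-2.avro": ["m1.avro", "m2.avro"],   # manifests are reused across snapshots
}
manifests = {
    "m1.avro": ["data/a.parquet", "data/b.parquet"],
    "m2.avro": ["data/c.parquet"],
}


def plan_files(snapshot_id=None):
    """Walk metadata -> manifest list -> manifests -> data files."""
    if snapshot_id is None:
        snapshot_id = table_metadata["current-snapshot-id"]
    manifest_list = table_metadata["snapshots"][snapshot_id]["manifest-list"]
    return [path for m in manifest_lists[manifest_list] for path in manifests[m]]
```

Note that snapshot 2 reuses manifest `m1.avro` from snapshot 1, which is why a new commit only needs to write metadata for the files it adds or removes.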
About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over more than five years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.