Apache Iceberg vs Paimon: Lakehouse Table Formats Compared (2026)

The lakehouse architecture has matured from concept to production standard in 2025–2026. But as data teams consolidate batch and streaming workloads, they face a pivotal choice: Apache Iceberg vs Apache Paimon. Both are open-source table formats designed for ACID compliance, schema evolution, and time-travel queries—but they differ fundamentally in heritage, write strategy, and optimal use cases.

Architecture at a glance

Architecture diagram — Apache Iceberg vs Paimon: Lakehouse Table Formats Compared (2026)

This post compares Iceberg and Paimon across architecture, streaming primitives, catalog integration, and operational complexity. By the end, you’ll have a decision matrix to choose the right format for your lakehouse.


Origins: Two Design Lineages

Apache Iceberg emerged from Netflix’s 2018 effort to solve the metadata scalability problem in data lakes. Netflix was running petabyte-scale Hadoop clusters with Hive tables and faced unbounded metadata blobs that made snapshots and time-travel impractical. Iceberg introduced versioned metadata as a first-class citizen, decoupling table snapshots from the data files themselves.

Apache Paimon (formerly Flink Table Store, open-sourced in 2023) took a different path. It was born from the Apache Flink community’s observation that Change Data Capture (CDC) and real-time streaming demanded a table format with built-in compaction primitives, dimension lookup, and sub-second write latency. Paimon started as a streaming-first design where LSM-trees and multi-level compaction were core, not add-ons.

This genealogy shapes everything: Iceberg optimizes for batch query engines first, streaming second. Paimon optimizes for Flink ingest first, analytics second.


Architecture: Manifest Hierarchies vs LSM-Trees

Iceberg: Manifest-Based Snapshots

Iceberg stores table metadata in a manifest hierarchy:

  1. Table metadata points to the latest snapshot ID
  2. Each snapshot references a manifest list (partition spec + file metadata)
  3. The manifest list enumerates manifest files per partition
  4. Manifest files contain entries for data files (path, row count, metrics)

The key insight: metadata is immutable and versionable. Every snapshot is a complete view of the table at a point in time, enabling cheap time-travel and concurrent reads without locks.

See arch_01.mmd for the hierarchy:
– Table → Snapshot → Manifest List → Manifest Files → Data Files
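To make the pointer chain concrete, here is a minimal in-memory sketch of the four-level hierarchy in Python. The dataclasses and the `plan_files` helper are illustrative stand-ins, not the real Iceberg library API; the point is that selecting an older snapshot ID is all time-travel requires, because every snapshot is a complete, immutable view.

```python
from dataclasses import dataclass

# Hypothetical in-memory model of Iceberg's metadata hierarchy --
# a sketch of the pointer chain, not the actual Iceberg library.

@dataclass
class DataFile:
    path: str
    row_count: int

@dataclass
class ManifestFile:
    partition: str
    data_files: list  # list of DataFile

@dataclass
class Snapshot:
    snapshot_id: int
    manifest_list: list  # list of ManifestFile

@dataclass
class TableMetadata:
    current_snapshot_id: int
    snapshots: dict  # snapshot_id -> Snapshot

def plan_files(meta, snapshot_id=None):
    """Walk Table -> Snapshot -> Manifest List -> Manifests -> Data Files.
    Passing an older snapshot_id is what makes time-travel cheap."""
    sid = snapshot_id if snapshot_id is not None else meta.current_snapshot_id
    snap = meta.snapshots[sid]
    return [df.path for m in snap.manifest_list for df in m.data_files]

# Two snapshots sharing an unchanged file: snapshot 1 is the older view.
f1 = DataFile("s3://bucket/t/a.parquet", 1000)
f2 = DataFile("s3://bucket/t/b.parquet", 500)
meta = TableMetadata(
    current_snapshot_id=2,
    snapshots={
        1: Snapshot(1, [ManifestFile("p=0", [f1])]),
        2: Snapshot(2, [ManifestFile("p=0", [f1]), ManifestFile("p=1", [f2])]),
    },
)
print(plan_files(meta))                 # current view: both files
print(plan_files(meta, snapshot_id=1))  # time-travel: only a.parquet
```

Note that snapshot 2 reuses the manifest entry for `a.parquet`; snapshots version metadata, not data.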

Paimon: Per-Partition LSM-Trees

Paimon uses a Log-Structured Merge-tree (LSM-tree) per partition, similar to RocksDB:

  1. Level 0: Unsorted in-memory and spillable runs (100–200 MB)
  2. Level 1+: Sorted runs with exponential size growth
  3. Compaction merges runs when level thresholds are exceeded
  4. Dimension tables use a dedicated snapshot-isolated LSM for fast lookups

The LSM design was chosen because:
– Sub-second write latency: writes hit Level 0 and return immediately
– Background compaction: doesn’t block ingest
– Efficient CDC: can capture deltas between compaction levels
– Natural dimension indexing: sorted runs enable fast point lookups

See arch_02.mmd for the LSM-tree per partition, with compaction triggers.
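The write path and compaction trigger can be sketched in a few lines. This toy class is illustrative only (Paimon’s real implementation manages sorted runs as files with configurable triggers); it shows why writes return immediately and why reads must merge across levels:

```python
# Toy LSM sketch (illustrative, not Paimon's actual implementation):
# writes land in Level 0 immediately; a size-count trigger merges
# Level 0's sorted runs into the next level.

class ToyLSM:
    def __init__(self, l0_trigger=4):
        self.levels = {0: []}          # level -> list of sorted runs
        self.l0_trigger = l0_trigger   # compact when L0 reaches this many runs

    def write(self, batch):
        # Fast write path: sort the batch, append it as a new L0 run.
        self.levels[0].append(sorted(batch))
        if len(self.levels[0]) >= self.l0_trigger:
            self.compact(0)

    def compact(self, level):
        # Merge all runs at `level` into one sorted run at `level + 1`.
        runs = self.levels.get(level, [])
        merged = sorted(r for run in runs for r in run)
        self.levels[level] = []
        self.levels.setdefault(level + 1, []).append(merged)

    def scan(self):
        # Read path merges runs across all levels (merge-on-read).
        rows = [r for runs in self.levels.values() for run in runs for r in run]
        return sorted(rows)

lsm = ToyLSM(l0_trigger=2)
lsm.write([5, 1])
lsm.write([3, 2])   # second run trips the trigger; L0 compacts into L1
print(lsm.levels)   # L0 empty, L1 holds one merged sorted run
print(lsm.scan())
```

In the real system compaction runs asynchronously in the background, which is what keeps ingest latency flat.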


Write Modes: CoW vs MoR, and Defaults Matter

Both formats support Copy-on-Write (CoW) and Merge-on-Read (MoR), but they diverge in philosophy:

| Aspect | Iceberg | Paimon |
| --- | --- | --- |
| Default mode | Copy-on-Write (CoW) | Merge-on-Read (MoR) |
| CoW cost | All rows rewritten to new file | Manifest + LSM compact only |
| MoR cost | V2 requires changelog tracking | Native; no extra bookkeeping |
| MoR latency | V2 can be slow for large deltas | Sub-second; LSM handles it |
| Delete handling | Position deletes (V2) tracked separately | Integrated in LSM |

Iceberg CoW is simpler for Spark and Trino, but rewrites entire data files even for single-row updates. Iceberg’s V2 merge-on-read avoids those rewrites, but requires changelog tracking and extra metadata.

Paimon MoR is baked into the design: updates hit Level 0 immediately, background compaction merges them later. No changelog overhead.

Operational reality: If you’re running sub-minute CDC ingests into Iceberg, you’ll want V2 MoR; if you’re running hourly batch upserts, CoW is fine. Paimon’s MoR is always there, making streaming feel natural.
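A back-of-envelope model makes the CoW penalty concrete. The file and row sizes below are hypothetical examples, and the delete-marker size is a rough placeholder, not a measured figure:

```python
# Back-of-envelope write amplification for a single-row update.
# Illustrative model only; real costs depend on file layout and engine.

def cow_bytes_written(file_size_bytes):
    # Copy-on-Write: the whole data file containing the row is rewritten.
    return file_size_bytes

def mor_bytes_written(row_size_bytes, delete_record_bytes=100):
    # Merge-on-Read: write the new row plus a small delete marker
    # (an Iceberg V2 position delete, or a Paimon L0 entry).
    return row_size_bytes + delete_record_bytes

file_size = 128 * 1024 * 1024   # a typical 128 MB Parquet file
row_size = 200                  # a 200-byte row

amplification = cow_bytes_written(file_size) / mor_bytes_written(row_size)
print(f"CoW writes {cow_bytes_written(file_size):,} bytes; "
      f"MoR writes {mor_bytes_written(row_size):,} bytes "
      f"(~{amplification:,.0f}x amplification)")
```

The asymmetry is why update-heavy streaming workloads push teams toward MoR, while append-mostly batch workloads barely notice CoW.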


Streaming-First: Paimon’s CDC Advantage

Paimon ships a native CDC connector and dimension-table logic; Iceberg gained comparable changelog capabilities only in V3 (2025):

Paimon CDC Path

  1. Flink CDC operator captures MySQL binlog / Kafka topics
  2. Flink Paimon sink writes directly to LSM Level 0
  3. Background compaction merges writes asynchronously
  4. Dimension lookups use snapshot-isolated LSM reads

Iceberg CDC Path (V3)

  1. Flink or Debezium buffers CDC events
  2. Micro-batching or upserts append to Iceberg
  3. ChangelogScan (new in V3) reads change deltas without full snapshots
  4. No native dimension logic; you build it with Flink side-inputs

See arch_03.mmd for the ingest paths side-by-side.

Latency comparison (typical):
– Paimon CDC: 100–500 ms end-to-end (MySQL binlog → query result)
– Iceberg CDC: 1–5 seconds (micro-batch + snapshot interval)
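A rough latency model explains the gap: the micro-batch path adds an expected wait of half the batch interval plus a snapshot commit. All stage timings below are illustrative defaults, not measurements:

```python
# Rough end-to-end latency model for the two CDC paths above.
# Stage timings are illustrative assumptions, not benchmarks.

def paimon_cdc_latency_ms(capture_ms=50, l0_write_ms=50, read_ms=100):
    # Streaming path: capture -> direct L0 write -> snapshot-isolated read.
    return capture_ms + l0_write_ms + read_ms

def iceberg_cdc_latency_ms(capture_ms=50, batch_interval_ms=1000,
                           commit_ms=500, read_ms=100):
    # Micro-batch path: an event waits up to a full batch interval
    # (half on average), then the batch is committed as a new snapshot
    # before it becomes readable.
    avg_wait = batch_interval_ms / 2
    return capture_ms + avg_wait + commit_ms + read_ms

print(f"Paimon:  ~{paimon_cdc_latency_ms():.0f} ms")
print(f"Iceberg: ~{iceberg_cdc_latency_ms():.0f} ms")
```

Shrinking the Iceberg batch interval narrows the gap but multiplies snapshots, which is exactly the metadata-cleanup pressure discussed later.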

Paimon wins for real-time operational tables (orders, inventory, customer state). Iceberg wins for analytics on immutable events (clickstream, logs).


Catalogs: REST, Polaris, Nessie, and Hive

Table formats need a catalog to store metadata pointers:

| Catalog | Iceberg | Paimon | Use Case |
| --- | --- | --- | --- |
| REST API | Full support | Beta support | Warehouse-agnostic, distributed |
| Polaris | Native (by Snowflake) | Not yet | Managed service, SOC2/FedRAMP |
| Nessie | Full support | Not yet | Git-like branching, time-travel |
| Unity Catalog | Databricks managed | Not yet | Databricks Lakehouse |
| Hive Metastore | Supported | Primary | Open-source, widely deployed |

See arch_04.mmd for catalog ecosystem and compute engine integrations.

Iceberg’s catalog advantage: multiple options allow organizations to avoid vendor lock-in. Polaris (Snowflake’s open REST catalog, now an Apache project) and Nessie (branching model for data versioning) give Iceberg users governance flexibility.

Paimon’s catalog limitation: Hive Metastore is the primary production catalog. REST support is beta. This means:
– Existing Hive deployments integrate easily
– But Paimon lacks the multi-branch versioning story Nessie provides
– Governance for Paimon is still evolving


Engine Ecosystem

Both formats work across multiple engines, but with different degrees of maturity:

Iceberg’s Broad Ecosystem

  • Apache Spark: Read + Write (CoW + V2 MoR)
  • Trino / Presto: Read + Write
  • Snowflake: Native read (managed Iceberg)
  • Amazon Athena: Native support (AWS-managed)
  • Google BigQuery: Read support (via REST)
  • DuckDB: Full support (analytical SQL)

Paimon’s Focused Ecosystem

  • Apache Flink: Native read + write (streaming + batch)
  • Apache Spark: Read + Write (via REST)
  • Trino: Read (via REST catalog beta)
  • StarRocks: Native read (fast OLAP queries)

Takeaway: Iceberg is the de facto standard for SQL warehouses and cloud-native analytics. Paimon is strongest in Flink-centric organizations and real-time StarRocks deployments.


Schema Evolution: Both Safe, Both ACID

Both formats handle schema changes correctly:

  • Add column: Safe, default value applied to existing rows
  • Drop column: Safe, column metadata removed
  • Rename column: Safe, metadata updated
  • Change type: Validated (e.g., int → long allowed; string → int not)

Key difference: Iceberg schema versioning is explicit in metadata; Paimon’s LSM handles it implicitly. Operationally, both are equally safe.
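The type-change rule both formats enforce is easy to sketch: widening conversions are lossless and allowed; narrowing or lossy ones are rejected. The rule table below is a simplified subset for illustration, not either format’s full promotion spec:

```python
# Sketch of safe type-promotion validation shared by both formats:
# widening conversions pass, lossy ones fail. Simplified subset only.

SAFE_PROMOTIONS = {
    ("int", "long"),
    ("float", "double"),
    ("decimal(10,2)", "decimal(20,2)"),  # wider precision, same scale
}

def can_promote(from_type, to_type):
    """Return True if an ALTER COLUMN type change is lossless."""
    if from_type == to_type:
        return True
    return (from_type, to_type) in SAFE_PROMOTIONS

assert can_promote("int", "long")        # widening: allowed
assert not can_promote("long", "int")    # narrowing: rejected
assert not can_promote("string", "int")  # lossy parse: rejected
```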


Operational Complexity: Manifest Cleanup vs Compaction Management

Iceberg Operations

  1. Metadata cleanup: Old snapshots accumulate; must use expire_snapshots() to garbage-collect
  2. Orphaned files: Failed writes may leave dangling files; remove_orphan_files() required
  3. No background compaction: Manual file consolidation via rewrite_data_files() if too many small files

Effort: Low if you automate snapshot expiration. High if you ignore orphaned files (they pile up and balloon your storage).
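The retention policy behind `expire_snapshots()` is worth internalizing before automating it. The sketch below models the common policy of keeping the last N snapshots plus anything newer than a cutoff; it is a simplified stand-in, not Iceberg’s actual procedure:

```python
# Sketch of snapshot-expiration retention logic: keep the last N
# snapshots plus anything newer than a cutoff timestamp.
# A policy model, not Iceberg's real expire_snapshots() procedure.

def expire_snapshots(snapshots, older_than_ts, retain_last=3):
    """snapshots: list of (snapshot_id, commit_ts).
    Returns (kept, expired), each ordered oldest first."""
    ordered = sorted(snapshots, key=lambda s: s[1])
    protected = {sid for sid, _ in ordered[-retain_last:]}
    kept, expired = [], []
    for sid, ts in ordered:
        if sid in protected or ts >= older_than_ts:
            kept.append((sid, ts))
        else:
            expired.append((sid, ts))
    return kept, expired

snaps = [(1, 100), (2, 200), (3, 300), (4, 400), (5, 500)]
kept, expired = expire_snapshots(snaps, older_than_ts=350, retain_last=2)
print("kept:", [s for s, _ in kept])       # the 2 most recent always survive
print("expired:", [s for s, _ in expired]) # older snapshots eligible for GC
```

Expired snapshots are what make their unreferenced data files collectible; skipping this step is exactly how orphaned files balloon storage.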

Paimon Operations

  1. Automatic compaction: Background job continuously merges LSM levels
  2. Compaction tuning: Configure Level 0 size threshold, max level count
  3. Compaction latency: May delay recent writes while merging

Effort: Medium. You must tune compaction (L0 size, write parallelism) for your ingest rate. Too aggressive → high CPU; too lenient → bloated L0.

For teams running both: Iceberg with automated snapshot expiration is simpler to operate; Paimon demands more hands-on compaction tuning.
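A quick sanity check for Paimon compaction tuning: if sustained ingest outpaces compaction throughput, Level 0 grows without bound. The rates below are hypothetical examples:

```python
# Steady-state L0 backlog check for compaction tuning.
# All throughput numbers are hypothetical illustrations.

def l0_backlog_after(seconds, ingest_mb_s, compaction_mb_s, start_mb=0.0):
    """Net Level 0 growth under steady ingest; a negative net rate
    drains the backlog down to zero."""
    net = ingest_mb_s - compaction_mb_s
    return max(0.0, start_mb + net * seconds)

# 80 MB/s ingest vs 60 MB/s compaction: backlog grows 20 MB per second.
assert l0_backlog_after(60, ingest_mb_s=80, compaction_mb_s=60) == 1200.0
# Raising compaction parallelism above the ingest rate drains the backlog.
assert l0_backlog_after(60, ingest_mb_s=80, compaction_mb_s=100,
                        start_mb=1200.0) == 0.0
```

This is the "too lenient → bloated L0" failure mode in one inequality: tune until sustained compaction throughput exceeds sustained ingest.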


CDC Primitives: Paimon’s Dedicated Toolkit

Paimon’s CDC design includes:

  1. CDC connector: Kafka source directly maps to Paimon updates
  2. Dimension table mode: Lookup table for joins (e.g., user profiles)
  3. Changelog reads: Get only delta between timestamps, no full table scan
  4. Delete semantics: Integrated; no separate position-delete bookkeeping

Iceberg V3’s ChangelogScan added similar capabilities in 2025:
– Read change deltas without full snapshots
– Works with position deletes and append-only logs

But Iceberg’s CDC tooling is younger and less battle-tested in production than Paimon’s Flink-native CDC.
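The changelog-read primitive both toolkits converge on can be sketched simply: return only the rows that changed between two points, never the full table. The event layout (`op`, `key`, `ts`) is a simplified stand-in for a real changelog record:

```python
# Sketch of a changelog read: fetch only the delta between two
# snapshot timestamps instead of scanning the full table.
# The event shape is a simplified stand-in for real changelog records.

def changelog_between(events, from_ts, to_ts):
    """events: dicts with 'op' (+I insert, +U update, -D delete),
    'key', and 'ts'. Returns the delta in the half-open (from_ts, to_ts]."""
    return [e for e in events if from_ts < e["ts"] <= to_ts]

events = [
    {"op": "+I", "key": "order-1", "ts": 100},
    {"op": "+U", "key": "order-1", "ts": 250},
    {"op": "-D", "key": "order-2", "ts": 300},
]
delta = changelog_between(events, from_ts=100, to_ts=300)
print([e["op"] for e in delta])  # only the +U and -D, not the full table
```

Downstream consumers replay exactly this delta, which is what keeps incremental pipelines proportional to change volume rather than table size.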


Decision Matrix: When to Pick Which

See arch_05.mmd for the decision tree. Here’s the distilled logic:

| Scenario | Recommendation | Rationale |
| --- | --- | --- |
| Batch ETL daily/hourly, Spark + Trino | Iceberg | Mature ecosystem, simple CoW, no compaction tuning |
| Sub-minute CDC ingest, Flink primary | Paimon | Native LSM, background compaction, CDC operators ready |
| Snowflake / BigQuery primary | Iceberg | Warehouse-native support; Iceberg is managed |
| Real-time dimension tables, lookup joins | Paimon | Dimension mode, no separate side-input logic |
| Multi-branch versioning needed (Git-like) | Iceberg + Nessie | Only solution; Paimon lacks branching |
| Open-source, self-managed, all OSS engines | Iceberg | More catalog flexibility; REST + Nessie options |
| StarRocks OLAP, real-time analytics | Paimon | StarRocks native integration, fast ingests |
| Hybrid: write streaming, read batch analytics | Paimon ingest → Iceberg REST | Paimon handles CDC, Iceberg REST catalog abstracts reads |
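The matrix distills into a few ordered checks. This helper is a heuristic sketch of that ordering (flag names are invented for illustration), not an authoritative selector:

```python
# The decision matrix, distilled into an ordered heuristic.
# Flag names are hypothetical; the ordering mirrors the table above.

def pick_format(flink_primary=False, sub_minute_cdc=False,
                needs_branching=False, warehouse_native=False,
                starrocks_olap=False, hybrid_tiers=False):
    if hybrid_tiers:
        return "Paimon ingest + Iceberg analytics (shared REST catalog)"
    if needs_branching:
        return "Iceberg + Nessie"  # Paimon lacks branching
    if warehouse_native:
        return "Iceberg"           # Snowflake/BigQuery manage it natively
    if starrocks_olap or (flink_primary and sub_minute_cdc):
        return "Paimon"            # streaming-first, LSM ingest
    return "Iceberg"               # batch-first default

print(pick_format(flink_primary=True, sub_minute_cdc=True))  # Paimon
print(pick_format(needs_branching=True))                     # Iceberg + Nessie
print(pick_format())                                         # Iceberg
```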

Hybrid Architectures: Paimon + Iceberg

A practical 2026 pattern is write-side and read-side specialization:

  1. Ingest tier: Apache Paimon + Flink CDC
    – Capture MySQL / Kafka changes into Paimon LSM
    – Background compaction keeps operational tables fresh
    – Sub-second latency for operational reads

  2. Analytics tier: Iceberg snapshot export
    – Paimon periodic snapshots exported to Iceberg REST catalog
    – Iceberg snapshots consumed by Spark, Trino, Snowflake
    – Time-travel and versioning at query tier

  3. Catalog: Shared REST catalog (Nessie or Polaris)
    – Both formats register with same catalog
    – Unified data discovery and governance

This pattern leverages both:
– Paimon’s streaming strength (low-latency ingest)
– Iceberg’s analytics strength (broad SQL engine support, versioning)


2026 Maturity Assessment

Apache Iceberg

  • Maturity: Production-ready (v1 since 2019)
  • Ecosystem: Snowflake, Databricks, AWS, Google invested
  • Risk: Highest adoption; most hiring knowledge available
  • Bleeding edge: V3 ChangelogScan, Puffin stats (2025)

Apache Paimon

  • Maturity: Production-ready (v0.4+, gained traction in 2024–2025)
  • Ecosystem: Alibaba, Tencent, ByteDance deployments; growing Flink integrations
  • Risk: Smaller ecosystem; Hive Metastore primary catalog (REST beta)
  • Bleeding edge: Cross-partition compaction, dynamic bucketing (2025)

Performance Benchmarks: Streaming Latency vs Batch Throughput

| Workload | Iceberg (CoW) | Iceberg (V2 MoR) | Paimon |
| --- | --- | --- | --- |
| 1M row batch insert | 15s (rewrites) | 3s (changelog) | 1s (LSM) |
| Single-row update latency | 15s (CoW) | 1–2s (MoR) | 100–500ms (LSM) |
| 1M row scan | 8s | 8s | 10s (L0 overhead) |
| CDC, 100 events/sec | 5–10s (batch latency) | 500ms | 100ms |

Reality check: Benchmarks vary by storage (HDFS, S3, local SSD), serialization (Parquet, ORC), and compute engine. These are representative; test your workload.


Common Pitfalls

Iceberg Pitfalls

  1. Orphaned files: Set up snapshot expiration or risk unbounded storage
  2. Manifest explosion: Too many snapshots → slow metadata reads; use rewrite_manifests()
  3. V2 MoR immaturity: Smaller test surface; avoid if CoW sufficient

Paimon Pitfalls

  1. Under-compacted L0: If ingest rate exceeds compaction, L0 bloats and slows reads
  2. Limited catalog options: Hive Metastore constraints (no branching, no fine-grained ACLs)
  3. Trino integration beta: REST catalog support still maturing

Migration Path: From Hive / Delta Lake

To Iceberg

  1. Use spark-sql or the Scala API to migrate Hive tables: CALL catalog_name.system.migrate('hive_db.table_name')
  2. Iceberg handles partitioning, stats, and schema automatically
  3. Existing Spark jobs work without code changes (the catalog implementation swaps to org.apache.iceberg.spark.SparkCatalog)

To Paimon

  1. Use the Flink SQL CLI or the DataStream API
  2. Requires Flink 1.17+
  3. The CDC connector makes streaming migration straightforward; batch migration requires a manual Spark job
  4. Hive Metastore integration is smooth

FAQ: Five PAA Questions

1. Can we use both Iceberg and Paimon in the same data platform?

Yes. Many platforms use Iceberg for batch analytics and Paimon for streaming ingest. Use a shared REST catalog (Nessie or Polaris) so both formats appear as one logical data warehouse. Tradeoff: dual-format support adds operational complexity (two metadata systems, two compaction strategies).

2. Does Paimon’s LSM compaction require tuning every time we scale ingest?

Often. Compaction speed depends on hardware (CPU cores, disk I/O) and LSM configuration. As ingest rate grows, Level 0 size or compaction concurrency may need adjustment. Iceberg avoids this by having no background compaction, but pays the cost in slow accumulation of small files. Start conservative (small L0 thresholds) and loosen as you tune.

3. Is Iceberg’s V2 Merge-on-Read stable enough for production in 2026?

Mostly, but with caveats. Iceberg V2 MoR works well in Spark and DuckDB; Trino support is newer. If you need V2 MoR in many engines, prioritize Iceberg V1 CoW or Paimon LSM instead. Ask your engine vendor (Databricks, Starburst, etc.) for production guarantees.

4. How does Paimon handle time-travel if compaction merges files?

Via snapshot IDs. Paimon retains LSM level structure in snapshot metadata, so you can query as-of a snapshot even after files are compacted. Similar to Iceberg: metadata is immutable, files are opaque.

5. What’s the cost difference: Iceberg vs Paimon over a year?

In cloud storage (S3/GCS):
– Iceberg + lazy cleanup: small files accumulate; expect 10–20% storage waste without active orphan cleanup
– Paimon + steady compaction: the LSM keeps file counts lower; expect 5–10% waste with tuned compaction

In compute:
– Iceberg: minimal overhead; metadata reads are fast
– Paimon: compaction jobs consume CPU continuously; budget for background workers

For a 10 TB daily ingest, Iceberg might cost 5–10% more in storage (orphaned files) but save on compute. Paimon balances both but requires operational attention. Net: similar TCO; pick based on workload fit, not cost.
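The 10 TB/day estimate works out as simple arithmetic. The waste fractions below are midpoints of the article’s rough ranges, not measured values:

```python
# Yearly storage under the article's rough waste ranges.
# Waste fractions are illustrative midpoints, not measurements.

def yearly_storage_tb(daily_ingest_tb, waste_fraction, days=365):
    raw = daily_ingest_tb * days
    return raw * (1 + waste_fraction)

raw_tb = 10 * 365                                        # 3,650 TB/year raw
iceberg_tb = yearly_storage_tb(10, waste_fraction=0.15)  # ~15% orphan waste
paimon_tb = yearly_storage_tb(10, waste_fraction=0.075)  # ~7.5% L0 waste

print(f"raw:     {raw_tb:,} TB")
print(f"Iceberg: {iceberg_tb:,.0f} TB")
print(f"Paimon:  {paimon_tb:,.0f} TB")
```

A few hundred terabytes a year either way, which is why the compute side (continuous compaction workers vs occasional maintenance jobs) usually dominates the comparison.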


Conclusion

Apache Iceberg vs Paimon is not a binary choice in 2026. Both are production-grade open-source table formats with distinct strengths:

  • Choose Iceberg if you are batch-first (Spark/Trino/Snowflake), value ecosystem breadth, or need multi-branch versioning
  • Choose Paimon if you are streaming-first (Flink CDC), need sub-second write latency, or prioritize background compaction
  • Combine both if you have distinct streaming and analytics tiers, using a shared REST catalog

The lakehouse architecture has matured to the point where table format choice should be driven by workload fit, not hype. Understand your ingest patterns, query engines, and operational constraints—then pick the format (or formats) that align.

Related reads:
– Iceberg vs Delta vs Hudi: Lakehouse Table Formats Compared (2026)
– Iceberg Catalogs: Polaris vs Nessie vs Unity Comparison (2026)
– Flink vs Spark Streaming vs Kafka Streams: Real-Time Processing (2026)


Last Updated: 2026-04-29

This post is part of the IoT Digital Twin PLM content series on cloud data platforms and lakehouse architectures.
