Apache Iceberg vs Paimon: Lakehouse Table Formats Compared (2026)
The lakehouse architecture has matured from concept to production standard in 2025–2026. But as data teams consolidate batch and streaming workloads, they face a pivotal choice: Apache Iceberg vs Apache Paimon. Both are open-source table formats designed for ACID compliance, schema evolution, and time-travel queries—but they differ fundamentally in heritage, write strategy, and optimal use cases.
This post compares Iceberg and Paimon across architecture, streaming primitives, catalog integration, and operational complexity. By the end, you’ll have a decision matrix to choose the right format for your lakehouse.
Origins: Netflix’s Batch Revolution vs Flink’s Streaming Inheritance
Apache Iceberg emerged from Netflix’s 2018 effort to solve the metadata scalability problem in data lakes. Netflix was running petabyte-scale Hadoop clusters with Hive tables, where ever-growing, directory-based metadata made consistent snapshots and time-travel impractical. Iceberg introduced versioned metadata as a first-class citizen, decoupling table snapshots from the data files themselves.
Apache Paimon (formerly Flink Table Store, open-sourced in 2023) took a different path. It was born from the Apache Flink community’s observation that Change Data Capture (CDC) and real-time streaming demanded a table format with built-in compaction primitives, dimension lookup, and sub-second write latency. Paimon started as a streaming-first design where LSM-trees and multi-level compaction were core, not add-ons.
This genealogy shapes everything: Iceberg optimizes for batch query engines first, streaming second. Paimon optimizes for Flink ingest first, analytics second.
Architecture: Manifest Hierarchies vs LSM-Trees
Iceberg: Manifest-Based Snapshots
Iceberg stores table metadata in a manifest hierarchy:
- Table metadata points to the latest snapshot ID
- Each snapshot references a manifest list (partition spec + file metadata)
- The manifest list enumerates manifest files per partition
- Manifest files contain entries for data files (path, row count, metrics)
The key insight: metadata is immutable and versionable. Every snapshot is a complete view of the table at a point in time, enabling cheap time-travel and concurrent reads without locks.
See arch_01.mmd for the hierarchy:
– Table → Snapshot → Manifest List → Manifest Files → Data Files
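You can see this hierarchy directly from a query engine: Iceberg exposes it through metadata tables and time-travel syntax. A minimal Spark SQL sketch, where the catalog, schema, and table names are illustrative:

```sql
-- Inspect the snapshot and manifest hierarchy via Iceberg's metadata tables.
SELECT snapshot_id, committed_at, operation
FROM lake.sales.orders.snapshots;

SELECT path, added_data_files_count, existing_data_files_count
FROM lake.sales.orders.manifests;

-- Time travel: read the table as of a snapshot ID or a wall-clock timestamp.
SELECT * FROM lake.sales.orders VERSION AS OF 8738948374938473845;
SELECT * FROM lake.sales.orders TIMESTAMP AS OF '2026-01-01 00:00:00';
```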
Paimon: Per-Partition LSM-Trees
Paimon uses a Log-Structured Merge-tree (LSM-tree) per partition, similar to RocksDB:
- Level 0: Unsorted in-memory and spillable runs (100–200 MB)
- Level 1+: Sorted runs with exponential size growth
- Compaction merges runs when level thresholds are exceeded
- Dimension tables use a dedicated snapshot-isolated LSM for fast lookups
The LSM design was chosen because:
– Sub-second write latency: writes hit Level 0 and return immediately
– Background compaction: doesn’t block ingest
– Efficient CDC: can capture deltas between compaction levels
– Natural dimension indexing: sorted runs enable fast point lookups
See arch_02.mmd for the LSM-tree per partition, with compaction triggers.
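A minimal Flink SQL sketch of a Paimon primary-key table, where each bucket inside a partition maintains its own LSM tree. The `'bucket'` and `'merge-engine'` options reflect recent Paimon releases, and the table name is illustrative:

```sql
-- A Paimon primary-key table in Flink SQL; each bucket keeps its own LSM tree.
-- Partition keys must be part of the primary key for partitioned PK tables.
CREATE TABLE orders (
    order_id BIGINT,
    user_id  BIGINT,
    amount   DECIMAL(10, 2),
    dt       STRING,
    PRIMARY KEY (dt, order_id) NOT ENFORCED
) PARTITIONED BY (dt) WITH (
    'bucket' = '4',                 -- number of buckets (LSM trees) per partition
    'merge-engine' = 'deduplicate'  -- keep the latest row per primary key
);
```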
Write Modes: CoW vs MoR, and Defaults Matter
Both formats support Copy-on-Write (CoW) and Merge-on-Read (MoR), but they diverge in philosophy:
| Aspect | Iceberg | Paimon |
|---|---|---|
| Default mode | Copy-on-Write (CoW) | Merge-on-Read (MoR) |
| CoW cost | All rows rewritten to new file | Manifest + LSM compact only |
| MoR cost | V2 requires delete-file tracking | Native; no extra bookkeeping |
| MoR latency | V2 can be slow for large deltas | Sub-second; LSM handles it |
| Delete handling | Position deletes (V2) tracked separately | Integrated in LSM |
Iceberg CoW is simpler for Spark and Trino, but rewrites entire data files even for single-row updates. The V2 format’s MoR path avoids those rewrites by writing delete files instead, at the cost of extra metadata and read-time merging.
Paimon MoR is baked into the design: updates hit Level 0 immediately, background compaction merges them later. No changelog overhead.
Operational reality: If you’re running sub-minute CDC ingests into Iceberg, you’ll want V2 MoR; if you’re running hourly batch upserts, CoW is fine. Paimon’s MoR is always there, making streaming feel natural.
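For Iceberg, the write mode is controlled per table through properties. A hedged Spark SQL sketch (table name illustrative) that switches row-level operations from the CoW default to MoR:

```sql
-- Switch an Iceberg table's row-level operations from copy-on-write (default)
-- to merge-on-read; MoR requires format version 2.
ALTER TABLE lake.sales.orders SET TBLPROPERTIES (
    'format-version'    = '2',
    'write.update.mode' = 'merge-on-read',
    'write.delete.mode' = 'merge-on-read',
    'write.merge.mode'  = 'merge-on-read'
);
```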
Streaming-First: Paimon’s CDC Advantage
Paimon ships a native CDC connector and dimension-table logic out of the box; Iceberg gained comparable change-reading only with V3’s ChangelogScan (2025):
Paimon CDC Path
- Flink CDC operator captures MySQL binlog / Kafka topics
- Flink Paimon sink writes directly to LSM Level 0
- Background compaction merges writes asynchronously
- Dimension lookups use snapshot-isolated LSM reads
Iceberg CDC Path (V3)
- Flink or Debezium buffers CDC events
- Micro-batching or upserts append to Iceberg
- ChangelogScan (new in V3) reads change deltas without full snapshots
- No native dimension logic; you build it with Flink side-inputs
See arch_03.mmd for the ingest paths side-by-side.
Latency comparison (typical):
– Paimon CDC: 100–500 ms end-to-end (MySQL binlog → query result)
– Iceberg CDC: 1–5 seconds (micro-batch + snapshot interval)
Paimon wins for real-time operational tables (orders, inventory, customer state). Iceberg wins for analytics on immutable events (clickstream, logs).
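As a sketch of the Paimon ingest path, the following Flink SQL assumes the MySQL CDC connector and a Paimon catalog are available on the classpath; hostnames, credentials, and table names are placeholders:

```sql
-- CDC source: MySQL binlog exposed as a changelog stream in Flink SQL.
CREATE TEMPORARY TABLE mysql_orders (
    order_id BIGINT,
    user_id  BIGINT,
    amount   DECIMAL(10, 2),
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector'     = 'mysql-cdc',
    'hostname'      = 'mysql.internal',
    'port'          = '3306',
    'username'      = 'cdc_user',
    'password'      = '******',
    'database-name' = 'shop',
    'table-name'    = 'orders'
);

-- Continuous upsert into a Paimon primary-key table; writes land in LSM Level 0
-- and background compaction merges them later.
INSERT INTO paimon_catalog.shop.orders
SELECT order_id, user_id, amount FROM mysql_orders;
```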
Catalogs: REST, Polaris, Nessie, and Hive
Table formats need a catalog to store metadata pointers:
| Catalog | Iceberg | Paimon | Use Case |
|---|---|---|---|
| REST API | Full support | Beta support | Warehouse-agnostic, distributed |
| Polaris | Native (originated at Snowflake) | Not yet | Managed service, SOC2/FedRAMP |
| Nessie | Full support | Not yet | Git-like branching, time-travel |
| Unity Catalog | Databricks managed | Not yet | Databricks Lakehouse |
| Hive Metastore | Supported | Primary | Open-source, widely deployed |
See arch_04.mmd for catalog ecosystem and compute engine integrations.
Iceberg’s catalog advantage: multiple options allow organizations to avoid vendor lock-in. Polaris (the Snowflake-initiated open REST catalog, now an Apache incubator project) and Nessie (branching model for data versioning) give Iceberg users governance flexibility.
Paimon’s catalog limitation: Hive Metastore is the primary production catalog. REST support is beta. This means:
– Existing Hive deployments integrate easily
– But Paimon lacks the multi-branch versioning story Nessie provides
– Governance for Paimon is still evolving
Compute Integration: Spark, Trino, Flink, and Specialized Engines
Both formats work across multiple engines, but with different degrees of maturity:
Iceberg’s Broad Ecosystem
- Apache Spark: Read + Write (CoW + V2 MoR)
- Trino / Presto: Read + Write
- Snowflake: Native read (managed Iceberg)
- Amazon Athena: Native support (AWS-managed)
- Google BigQuery: Read support (via REST)
- DuckDB: Full support (analytical SQL)
Paimon’s Focused Ecosystem
- Apache Flink: Native read + write (streaming + batch)
- Apache Spark: Read + Write (via REST)
- Trino: Read (via REST catalog beta)
- StarRocks: Native read (fast OLAP queries)
Takeaway: Iceberg is the de facto standard for SQL warehouses and cloud-native analytics. Paimon is strongest in Flink-centric organizations and real-time StarRocks deployments.
Schema Evolution: Both Safe, Both ACID
Both formats handle schema changes correctly:
- Add column: Safe, default value applied to existing rows
- Drop column: Safe, column metadata removed
- Rename column: Safe, metadata updated
- Change type: Validated (e.g., int → long allowed; string → int not)
Key difference: Iceberg schema versioning is explicit in metadata; Paimon’s LSM handles it implicitly. Operationally, both are equally safe.
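What these evolutions look like against an Iceberg table in Spark SQL (table name illustrative; Paimon accepts equivalent statements via Flink or Spark):

```sql
-- Safe schema changes against an Iceberg table from Spark SQL.
ALTER TABLE lake.sales.orders ADD COLUMN discount DECIMAL(10, 2);
ALTER TABLE lake.sales.orders RENAME COLUMN amount TO gross_amount;
ALTER TABLE lake.sales.orders ALTER COLUMN order_id TYPE BIGINT;  -- int -> long widening is allowed
ALTER TABLE lake.sales.orders DROP COLUMN legacy_flag;
```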
Operational Complexity: Manifest Cleanup vs Compaction Management
Iceberg Operations
- Metadata cleanup: Old snapshots accumulate; must use `expire_snapshots()` to garbage-collect
- Orphaned files: Failed writes may leave dangling files; `remove_orphan_files()` required
- No background compaction: Manual file consolidation via `rewrite_data_files()` if too many small files
Effort: Low if you automate snapshot expiration. High if you ignore orphaned files (they pile up and balloon your storage).
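These three tasks map to Spark procedures. A sketch assuming a catalog named lake with the Iceberg SQL extensions enabled; schedule them as periodic jobs:

```sql
-- Expire old snapshots (keeps the table's metadata bounded).
CALL lake.system.expire_snapshots(
    table       => 'sales.orders',
    older_than  => TIMESTAMP '2026-03-01 00:00:00',
    retain_last => 50);

-- Delete files left behind by failed writes.
CALL lake.system.remove_orphan_files(table => 'sales.orders');

-- Consolidate small files into ~512 MB targets.
CALL lake.system.rewrite_data_files(
    table   => 'sales.orders',
    options => map('target-file-size-bytes', '536870912'));
```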
Paimon Operations
- Automatic compaction: Background job continuously merges LSM levels
- Compaction tuning: Configure Level 0 size threshold, max level count
- Compaction latency: May delay recent writes while merging
Effort: Medium. You must tune compaction (L0 size, write parallelism) for your ingest rate. Too aggressive → high CPU; too lenient → bloated L0.
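A hedged example of the main compaction knobs, set as table options from Flink SQL; the option names follow recent Paimon releases, so verify them against your version’s documentation:

```sql
-- Tune the LSM write path and compaction triggers on an existing Paimon table.
ALTER TABLE paimon_catalog.shop.orders SET (
    'write-buffer-size'                 = '256 mb',  -- in-memory buffer before flushing to Level 0
    'num-sorted-run.compaction-trigger' = '5',       -- start compaction after this many sorted runs
    'num-sorted-run.stop-trigger'       = '10'       -- back-pressure writers if runs keep piling up
);
```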
Hybrid teams: Iceberg + automatic snapshot expiration is simpler; Paimon requires more hands-on tuning.
CDC Primitives: Paimon’s Dedicated Toolkit
Paimon’s CDC design includes:
- CDC connector: Kafka source directly maps to Paimon updates
- Dimension table mode: Lookup table for joins (e.g., user profiles)
- Changelog reads: Get only the delta between snapshots or timestamps, no full table scan (sketched below)
- Delete semantics: Integrated; no separate position-delete bookkeeping
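A sketch of those changelog-style reads in Flink SQL, using Paimon’s dynamic option hints; the option names follow recent Paimon documentation and the snapshot IDs are placeholders:

```sql
-- Batch read: only the rows that changed between snapshot 12 and snapshot 20.
SELECT * FROM orders /*+ OPTIONS('incremental-between' = '12,20') */;

-- Streaming read: subscribe to the table's changelog from the latest snapshot onward.
SELECT * FROM orders /*+ OPTIONS('scan.mode' = 'latest') */;
```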
Iceberg V3’s ChangelogScan added similar capabilities in 2025:
– Read change deltas without full snapshots
– Works with position deletes and append-only logs
But Iceberg’s CDC tooling is younger and less battle-tested in production than Paimon’s Flink-native CDC.
Decision Matrix: When to Pick Which
See arch_05.mmd for the decision tree. Here’s the distilled logic:
| Scenario | Recommendation | Rationale |
|---|---|---|
| Batch ETL daily/hourly, Spark + Trino | Iceberg | Mature ecosystem, simple CoW, no compaction tuning |
| Sub-minute CDC ingest, Flink primary | Paimon | Native LSM, background compaction, CDC operators ready |
| Snowflake / BigQuery primary | Iceberg | Warehouse-native support; Iceberg is managed |
| Real-time dimension tables, lookup joins | Paimon | Dimension mode, no separate side-input logic |
| Multi-branch versioning needed (Git-like) | Iceberg + Nessie | Only solution; Paimon lacks branching |
| Open-source, self-managed, all OSS engines | Iceberg | More catalog flexibility; REST + Nessie options |
| StarRocks OLAP, real-time analytics | Paimon | StarRocks native integration, fast ingests |
| Hybrid: write streaming, read batch analytics | Paimon ingest → Iceberg REST | Paimon handles CDC, Iceberg REST catalog abstracts reads |
Hybrid Architectures: Paimon + Iceberg
A practical 2026 pattern is write-side and read-side specialization:
- Ingest tier: Apache Paimon + Flink CDC
  – Capture MySQL / Kafka changes into Paimon LSM
  – Background compaction keeps operational tables fresh
  – Sub-second latency for operational reads
- Analytics tier: Iceberg snapshot export
  – Paimon periodic snapshots exported to Iceberg REST catalog
  – Iceberg snapshots consumed by Spark, Trino, Snowflake
  – Time-travel and versioning at query tier
- Catalog: Shared REST catalog (Nessie or Polaris)
  – Both formats register with the same catalog
  – Unified data discovery and governance
This pattern leverages both:
– Paimon’s streaming strength (low-latency ingest)
– Iceberg’s analytics strength (broad SQL engine support, versioning)
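One way to wire the shared-catalog piece of this pattern from a single Flink SQL session; the URIs and warehouse paths are placeholders, and the Iceberg catalog’s 'catalog-type' = 'rest' option assumes a recent iceberg-flink runtime:

```sql
-- Register the Paimon warehouse used for ingest.
CREATE CATALOG paimon_cat WITH (
    'type'      = 'paimon',
    'warehouse' = 's3://lakehouse/paimon'
);

-- Register the shared Iceberg REST catalog used by the analytics tier.
CREATE CATALOG iceberg_cat WITH (
    'type'         = 'iceberg',
    'catalog-type' = 'rest',
    'uri'          = 'https://catalog.internal/api/catalog'
);

-- Simplified periodic export (run in batch mode) from the operational table
-- into the analytics tier; real jobs usually dedupe and repartition.
INSERT OVERWRITE iceberg_cat.analytics.orders
SELECT * FROM paimon_cat.shop.orders;
```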
2026 Maturity Assessment
Apache Iceberg
- Maturity: Production-ready (v1 since 2019)
- Ecosystem: Snowflake, Databricks, AWS, Google invested
- Risk: Highest adoption; most hiring knowledge available
- Bleeding edge: V3 ChangelogScan, Puffin stats (2025)
Apache Paimon
- Maturity: Production-ready (v0.4+, gained traction in 2024–2025)
- Ecosystem: Alibaba, Tencent, ByteDance deployments; growing Flink integrations
- Risk: Smaller ecosystem; Hive Metastore primary catalog (REST beta)
- Bleeding edge: Cross-partition compaction, dynamic bucketing (2025)
Performance Benchmarks: Streaming Latency vs Batch Throughput
| Workload | Iceberg (CoW) | Iceberg (V2 MoR) | Paimon |
|---|---|---|---|
| 1M row batch insert | 15s (rewrites) | 3s (changelog) | 1s (LSM) |
| Single-row update latency | 15s (CoW) | 1–2s (MoR) | 100–500ms (LSM) |
| 1M row scan | 8s | 8s | 10s (L0 overhead) |
| CDC 100 events/sec | 5–10s batch latency | 500ms | 100ms |
Reality check: Benchmarks vary by storage (HDFS, S3, local SSD), serialization (Parquet, ORC), and compute engine. These are representative; test your workload.
Common Pitfalls
Iceberg Pitfalls
- Orphaned files: Set up snapshot expiration or risk unbounded storage
- Manifest explosion: Too many snapshots → slow metadata reads; use `rewrite_manifests()`
- V2 MoR immaturity: Smaller test surface; avoid if CoW is sufficient
Paimon Pitfalls
- Under-compacted L0: If ingest rate exceeds compaction, L0 bloats and slows reads
- Limited catalog options: Hive Metastore constraints (no branching, no fine-grained ACLs)
- Trino integration beta: REST catalog support still maturing
Migration Path: From Hive / Delta Lake
To Iceberg
- Use `spark-sql` or the Scala API to migrate Hive tables in place with the `migrate` procedure (shown below)
- Iceberg handles partitioning, stats, and schema automatically
- Existing Spark jobs work without code changes (the data source swaps to `org.apache.iceberg.spark`)
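The Spark procedures behind that migration step, assuming an Iceberg catalog named lake with the SQL extensions enabled; `snapshot` gives a non-destructive trial run before the in-place `migrate`:

```sql
-- Non-destructive trial: create an Iceberg table that shadows the Hive table.
CALL lake.system.snapshot('hive_db.table_name', 'hive_db.table_name_iceberg');

-- Migrate in place once validated (replaces the Hive table with an Iceberg one).
CALL lake.system.migrate('hive_db.table_name');
```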
To Paimon
- Use the `flink sql` CLI or the DataStream API
- Requires Flink 1.17+
- The CDC connector makes streaming migration straightforward; batch migration requires a manual Spark job
- Hive Metastore integration is smooth
FAQ: Five Common Questions
1. Can we use both Iceberg and Paimon in the same data platform?
Yes. Many platforms use Iceberg for batch analytics and Paimon for streaming ingest. Use a shared REST catalog (Nessie or Polaris) so both formats appear as one logical data warehouse. Tradeoff: dual-format support adds operational complexity (two metadata systems, two compaction strategies).
2. Does Paimon’s LSM compaction require tuning every time we scale ingest?
Often. Compaction speed depends on hardware (CPU cores, disk I/O) and LSM configuration. As ingest rate grows, Level 0 size or compaction concurrency may need adjustment. Iceberg avoids this by having no background compaction, but pays the cost in slow accumulation of small files. Start conservative (small L0 thresholds) and loosen as you tune.
3. Is Iceberg’s V2 Merge-on-Read stable enough for production in 2026?
Mostly, but with caveats. Iceberg V2 MoR works well in Spark and DuckDB; Trino support is newer. If you need V2 MoR in many engines, prioritize Iceberg V1 CoW or Paimon LSM instead. Ask your engine vendor (Databricks, Starburst, etc.) for production guarantees.
4. How does Paimon handle time-travel if compaction merges files?
Via snapshot IDs. Paimon records the LSM level structure in snapshot metadata, so you can query as of a snapshot even after later compactions. Similar to Iceberg: snapshot metadata is immutable, and the files a snapshot references are retained until that snapshot expires.
5. What’s the cost difference: Iceberg vs Paimon over a year?
In cloud storage (S3/GCS):
– Iceberg + lazy cleanup: Small files accumulate; expect 10–20% storage waste without active orphan cleanup
– Paimon + steady compaction: LSM keeps file count lower; expect 5–10% waste if you tune compaction
In compute:
– Iceberg: Minimal overhead; metadata reads are fast
– Paimon: Compaction jobs consume CPU continuously; budget for background workers
For a 10 TB daily ingest, Iceberg might cost 5–10% more in storage (orphaned files) but save on compute. Paimon balances both but requires operational attention. Net: similar TCO; pick based on workload fit, not cost.
Conclusion
Apache Iceberg vs Paimon is not a binary choice in 2026. Both are production-grade open-source table formats with distinct strengths:
- Choose Iceberg if you are batch-first (Spark/Trino/Snowflake), value ecosystem breadth, or need multi-branch versioning
- Choose Paimon if you are streaming-first (Flink CDC), need sub-second write latency, or prioritize background compaction
- Combine both if you have distinct streaming and analytics tiers, using a shared REST catalog
The lakehouse architecture has matured to the point where table format choice should be driven by workload fit, not hype. Understand your ingest patterns, query engines, and operational constraints—then pick the format (or formats) that align.
Related reads:
– Iceberg vs Delta vs Hudi: Lakehouse Table Formats Compared (2026)
– Iceberg Catalogs: Polaris vs Nessie vs Unity Comparison (2026)
– Flink vs Spark Streaming vs Kafka Streams: Real-Time Processing (2026)
Last Updated: 2026-04-29
This post is part of the IoT Digital Twin PLM content series on cloud data platforms and lakehouse architectures.
