Apache Iceberg vs Paimon: Lakehouse Table Formats 2026
Two open table formats now dominate lakehouse roadmaps, and choosing wrong locks you into a rewrite. The Apache Iceberg vs Paimon decision is not a popularity contest between two interchangeable formats; it is a choice between two fundamentally different storage engines wearing the same “open table format” label. Iceberg organizes immutable data files under a snapshot-and-manifest metadata tree built for large analytical tables and slow-moving change. Paimon embeds a Log-Structured Merge-tree (LSM-tree) under the table so it can absorb a firehose of streaming upserts and emit a clean change stream. Pick Iceberg and you will fight it on second-latency CDC; pick Paimon and you will discover that its batch-scan ergonomics are younger. This post explains the mechanics that drive that trade-off, with current 2026 versions, so you choose once.
What this covers: metadata layout, compaction, primary-key tables and changelog producers, catalog and engine support, streaming versus batch fit, and a decision matrix.
Context and Background
The “lakehouse” idea — ACID transactions, schema evolution, and time travel over cheap object storage — went mainstream because three open table formats made it real: Apache Iceberg, Delta Lake, and Apache Hudi. Iceberg won the neutrality war. Its specification is engine-agnostic, governed by the Apache Software Foundation, and now read or written by Spark, Trino, Flink, Snowflake, Databricks, BigQuery, Amazon EMR, Athena, and Dremio. The Iceberg v3 specification reached general availability across major vendors in May 2026, adding deletion vectors, row lineage for native change data capture (CDC), a VARIANT type, default column values, and nanosecond timestamps.
Apache Paimon entered from a different doorway. It began as Flink Table Store, an Apache Flink subproject solving a problem Iceberg handled awkwardly: continuously ingesting database changelogs and streaming upserts into the lake at low latency. Graduating to a top-level Apache project as Paimon, it kept its defining bet — an LSM-tree under every primary-key table — and added broad engine reads. By 2026 Paimon ships 1.x releases with Iceberg-compatible snapshot output (1.2.0+), a VARIANT type, and Lance integration for multimodal AI data.
The result is two projects converging on the same promise from opposite engineering foundations. Understanding the lakehouse table formats debate means understanding those foundations, not the feature checklists. A useful mental model: Iceberg is a versioned filesystem-of-record with transactional metadata, where the table is the set of files a snapshot declares valid. Paimon is a storage engine that happens to persist to object storage, where the table is the merged result of an ever-running compaction. Those are different abstractions, and they leak into everything from latency to operational tooling.
It also matters why each project exists. Iceberg was born at Netflix to fix the correctness and scale failures of Hive tables — partition-directory listing, no atomic commits, no safe schema evolution — for enormous analytical datasets. Paimon was born inside Flink to fix a different failure: getting a continuous stream of mutations into a queryable lake table without rebuilding it. Neither was designed for the other’s problem, and the 2026 feature convergence does not change the gravity of those origins. For the streaming engines that feed both, see our Flink vs Spark Streaming vs Kafka Streams comparison. For background on the format wars generally, the Apache Iceberg specification is the authoritative primary source.
A third lens on the same divide is who commits. In Iceberg, a small number of writers issue a small number of large, transactional commits, and the metadata model is optimized for making each of those commits atomic, conflict-checked, and cheaply rolled back. In Paimon, a continuous writer issues an unbounded stream of tiny writes and the system never expects a clean quiescent moment — compaction runs forever in the background. That difference in commit cadence is the cleanest predictor of which format a workload will feel comfortable in: a few big commits an hour is Iceberg’s home turf; thousands of tiny mutations a second is Paimon’s. Hold that cadence question in mind through everything that follows, because nearly every concrete behavior below traces back to it.
Iceberg vs Paimon: The Core Design Difference
Apache Iceberg is a snapshot-and-metadata-tree format: every write produces immutable data files and a new metadata snapshot, with row-level changes handled by copy-on-write or merge-on-read delete files. Apache Paimon is an LSM-tree format: writes land in sorted runs, a background compaction merges them, and a primary key lets it upsert and emit changelog. Iceberg optimizes large analytical tables; Paimon optimizes streaming CDC.

Figure 1: Iceberg layers a catalog pointer over a metadata-file tree of manifest lists, manifests, and immutable data plus delete files. Paimon layers a catalog over per-bucket LSM-trees of sorted runs that compaction merges, with an optional changelog stream.
The diagram captures why the two behave so differently under load. In Iceberg, the unit of change is a snapshot. The current table state is a single metadata JSON file pointing to a manifest list, which points to manifest files, which list the data files and any delete files valid for that snapshot. A commit is an atomic swap of the metadata pointer. Nothing is mutated in place; readers see a consistent snapshot and writers never block readers. This is beautiful for tables that change in large batches a few times an hour, and it is the source of Iceberg’s clean time travel and rollback.
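The snapshot model can be sketched in a few lines of plain Python. This is a toy illustration, not the Iceberg API: `ToyTable`, `Snapshot`, and the file names are hypothetical, and the manifest list and manifest layers are collapsed into a single set of file names per snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    data_files: frozenset  # the files this snapshot declares valid


class ToyTable:
    """Toy stand-in for a snapshot-based table (hypothetical, not Iceberg's API)."""

    def __init__(self):
        self.snapshots = {0: Snapshot(0, frozenset())}
        self.current_id = 0  # plays the role of the catalog pointer

    def commit(self, added=(), removed=()):
        # Nothing is mutated in place: a commit builds a new immutable
        # snapshot and then moves the pointer to it.
        base = self.snapshots[self.current_id]
        files = (base.data_files - frozenset(removed)) | frozenset(added)
        new_id = max(self.snapshots) + 1
        self.snapshots[new_id] = Snapshot(new_id, files)
        self.current_id = new_id
        return new_id

    def scan(self, snapshot_id=None):
        # Readers resolve one snapshot and see exactly its file set.
        sid = self.current_id if snapshot_id is None else snapshot_id
        return sorted(self.snapshots[sid].data_files)

    def rollback(self, snapshot_id):
        # Rollback is just moving the pointer back to an older snapshot.
        self.current_id = snapshot_id


table = ToyTable()
s1 = table.commit(added=["a.parquet", "b.parquet"])
s2 = table.commit(added=["c.parquet"], removed=["a.parquet"])
current = table.scan()     # ['b.parquet', 'c.parquet']
as_of_s1 = table.scan(s1)  # ['a.parquet', 'b.parquet'] (time travel)
table.rollback(s1)
```

Because old snapshots are never modified, time travel and rollback fall out of the data structure rather than needing separate machinery.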
Paimon’s unit of change is a record flowing into an LSM-tree. Each table is split into buckets, and each bucket holds an LSM-tree of sorted data files organized into levels. New records — including updates and deletes keyed by the primary key — are written to the top level as small files, then compaction merges them downward, deduplicating by key. The same structure that makes a write cheap (append a small sorted run) makes Paimon natively good at upserts and at producing a changelog, because the merge step already knows the before and after of every key.
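A minimal sketch of that write-then-merge flow, assuming a single bucket and in-memory dicts in place of sorted files. The names `write_run`, `compact`, and `TOMBSTONE` are illustrative and do not come from Paimon.

```python
TOMBSTONE = object()  # marker value meaning "this key was deleted"


def write_run(records):
    """Turn one batch of (key, value) writes into a small key-sorted run."""
    # Sorting by key only; the sort is stable, so the last write for a
    # duplicate key within the batch is the one the dict keeps.
    return dict(sorted(records, key=lambda kv: kv[0]))


def compact(runs_oldest_first):
    """Merge runs into one, letting newer values shadow older ones."""
    merged = {}
    for run in runs_oldest_first:
        merged.update(run)
    # Dropping tombstones is only safe here because this toy merges
    # every run at once (a full compaction).
    live = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    return dict(sorted(live.items()))


runs = [
    write_run([(2, "bob"), (1, "alice")]),
    write_run([(1, "alice-v2")]),            # update of key 1
    write_run([(2, TOMBSTONE), (3, "carol")]),  # delete of key 2
]
merged_state = compact(runs)  # {1: 'alice-v2', 3: 'carol'}
```

Each write is an append of a small run, and the merge step is where updates and deletes are actually resolved.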
A second consequence is read-path symmetry between streaming and batch. Iceberg readers consume a single snapshot, so an Iceberg streaming read is really a sequence of incremental snapshot diffs. Paimon readers, by contrast, can subscribe to the bucket’s changelog directly or scan the merged state, so the same table serves a low-latency streaming consumer and a batch scan from one storage layout. That dual-read capability is the architectural payoff of putting the LSM-tree under the table, and it is the reason Paimon markets itself as a streaming lakehouse rather than a table format alone. It is also why a Paimon streaming lakehouse deployment usually has fewer moving parts than the equivalent Iceberg pipeline, which often needs a separate message bus to hold the change stream.
Why the metadata model dictates write latency
In Iceberg, every commit writes new metadata files and lists every relevant data file. For a table receiving thousands of tiny streaming writes, that metadata overhead explodes: you accumulate snapshots and manifests faster than you can compact them, and small-file proliferation degrades scans. Iceberg’s answer is to batch writes — Flink’s Iceberg sink commits on a checkpoint interval, trading latency for fewer, larger commits.
There is a concrete failure mode hiding in that sentence. Iceberg commits are optimistic: a writer reads the current snapshot, stages its files, and attempts an atomic pointer swap that fails if another writer committed first. Under a single batched writer this is invisible. Push the commit rate up — say two streaming jobs both checkpointing every few seconds into the same table — and commit conflicts and retries become the bottleneck, because each retry must re-read and re-validate the latest manifest set. The metadata model that gives Iceberg its clean atomicity is the same model that punishes a high commit rate, which is why “just checkpoint more often” is not a free lever.
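The retry loop behind an optimistic commit can be sketched as a compare-and-swap on a version counter. This is a simplified model under stated assumptions: `ToyCatalog`, `commit_with_retry`, and the `before_attempt` hook are hypothetical names, and real commits also re-validate manifests on each retry.

```python
class CommitConflict(Exception):
    pass


class ToyCatalog:
    """Holds a single table pointer, modelled as an integer version."""

    def __init__(self):
        self.version = 0

    def compare_and_swap(self, expected, new):
        # The swap only succeeds if nobody committed since we read.
        if self.version != expected:
            raise CommitConflict(f"expected {expected}, found {self.version}")
        self.version = new


def commit_with_retry(catalog, before_attempt=None, max_attempts=5):
    """Return how many attempts it took to land one commit."""
    for attempt in range(1, max_attempts + 1):
        base = catalog.version            # read the current snapshot pointer
        if before_attempt is not None:
            before_attempt(attempt)       # hook that lets a rival writer sneak in
        try:
            catalog.compare_and_swap(base, base + 1)
            return attempt
        except CommitConflict:
            continue                      # re-read the pointer and try again
    raise CommitConflict("gave up after max_attempts")


catalog = ToyCatalog()


def rival(attempt):
    # A second writer commits between our read and our swap, first time only.
    if attempt == 1:
        catalog.compare_and_swap(catalog.version, catalog.version + 1)


attempts = commit_with_retry(catalog, before_attempt=rival)  # 2
```

With one rival commit the writer needs a second attempt; with many writers committing every few seconds, the retries themselves become the workload.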
Paimon inverts the priority. Because the LSM-tree expects a stream of small sorted runs and compacts asynchronously, frequent small writes are the design center, not an abuse of it. The cost moves to read time and to compaction, which is exactly the trade an LSM-tree makes everywhere it appears, from RocksDB to Cassandra. Crucially, Paimon partitions the write path by bucket, so two records with different keys can land in different buckets concurrently without contending for a single global commit — the contention surface is far smaller than Iceberg’s single metadata pointer.
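Bucket routing can be illustrated with a simple hash-and-modulo sketch. The hash here is `zlib.crc32`, chosen only because it is deterministic in Python; it is an assumption for illustration and not the hash Paimon uses, and `bucket_for` and `route` are hypothetical helpers.

```python
import zlib
from collections import defaultdict


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a primary key to a bucket (illustrative hash, not Paimon's)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def route(records, num_buckets):
    """Group (key, value) records by bucket so each bucket can be written independently."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[bucket_for(key, num_buckets)].append((key, value))
    return dict(buckets)


records = [("cust-1", "a"), ("cust-2", "b"), ("cust-1", "c")]
routed = route(records, num_buckets=4)
```

All writes for one key always land in the same bucket, which is what lets each bucket keep its own LSM-tree without coordinating with the others.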
Why the primary key changes everything
Iceberg tables historically had no enforced primary key; row-level updates were expressed through delete files matched at read time. Paimon’s primary-key table makes the key a first-class citizen of storage layout. Upserts, partial updates, and deletes are all keyed operations the merge engine resolves. This is the single largest behavioral divergence in the Iceberg vs Paimon CDC comparison, and it ripples into compaction, changelog, and query semantics — the subjects of the next section.
Consider a concrete example: a customers dimension table receiving a million updates an hour from an operational database. In Iceberg, each update is logically a delete-plus-insert; with merge-on-read the engine writes delete files and new data files, and the reader reconciles them. Over an hour that is a large pile of metadata to compact away, and until you do, scans get slower. In Paimon, those million updates are keyed upserts that the LSM-tree absorbs as small runs and collapses by key during compaction; a reader of the merged state sees one current row per customer with no fan-out of delete files. The end state is identical; the path and the operational cost are not. That divergence — same result, different mechanics — is the through-line of this entire comparison.
The primary key also unlocks partial-column updates, which are awkward in Iceberg and native in Paimon. Imagine an order record enriched in stages: the operational system writes order_id, status, and amount; a downstream job later attaches a risk_score; a third attaches a shipping_eta. In Iceberg you would express each enrichment as a full row rewrite or a merge-into statement that re-reads the existing row, because there is no notion of “update only these columns for this key.” In Paimon, the partial-update merge engine lets each writer emit only the columns it owns, and compaction stitches the latest non-null value per column per key into a single coherent row. For multi-source enrichment pipelines this is a structural simplification, not a convenience — it removes an entire class of read-modify-write coordination that Iceberg forces on you.
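The staged order enrichment above can be sketched as a per-key, per-column merge that keeps the latest non-null value. This is a toy model of the idea, not Paimon's merge engine: `partial_update_merge` is a hypothetical function, and the sample values are made up.

```python
def partial_update_merge(writes, key_column="order_id"):
    """Merge writes in arrival order, keeping the latest non-null value per column."""
    rows = {}
    for write in writes:
        row = rows.setdefault(write[key_column], {})
        for column, value in write.items():
            if value is not None:  # a null never overwrites an earlier value
                row[column] = value
    return rows


writes = [
    {"order_id": 1, "status": "NEW", "amount": 40.0},          # operational system
    {"order_id": 1, "risk_score": 0.12},                        # risk job
    {"order_id": 1, "status": "SHIPPED", "shipping_eta": "2026-03-01"},  # shipping job
]
orders = partial_update_merge(writes)
```

No writer had to read the existing row first; each one emitted only the columns it owns and the merge produced one coherent record.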
Deeper Analysis: Compaction, Changelog, and Catalogs
This is where the abstract “snapshot versus LSM” difference becomes operational. Three mechanisms — how each format compacts, how it produces a change stream, and how it plugs into catalogs and engines — decide whether a workload feels native or bolted on.

Figure 2: Paimon writes new records as small level-0 files, then compaction merges them into larger lower-level sorted runs, deduplicating by primary key and applying the configured merge engine.
Compaction: amortized merge versus rewrite
Paimon’s compaction is the LSM merge: it reads overlapping sorted runs, merges them by key applying the merge engine (deduplicate keeps the last row; partial-update progressively fills columns; aggregation rolls values up; first-row keeps the earliest), and writes fewer, larger files. Write amplification is the well-known LSM cost — each record may be rewritten several times as it sinks through levels — and tuning compaction frequency is the central operational lever. Too lazy and reads slow down from too many runs; too eager and you burn CPU and IO.
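The difference between three of the merge engines named above comes down to how the versions of one key are collapsed. A rough sketch, assuming single-value rows and using `sum` as the aggregation; Paimon configures aggregation per field, so the sum and the `merge_key` helper are illustrative assumptions only.

```python
def merge_key(versions_oldest_first, engine):
    """Collapse all versions of one key according to a merge engine."""
    if engine == "deduplicate":
        return versions_oldest_first[-1]   # keep the last row
    if engine == "first-row":
        return versions_oldest_first[0]    # keep the earliest row
    if engine == "aggregation":
        return sum(versions_oldest_first)  # roll values up (sum assumed here)
    raise ValueError(f"unknown merge engine: {engine}")


versions = [10, 25, 5]  # three writes to the same key, oldest first
results = {e: merge_key(versions, e) for e in ("deduplicate", "first-row", "aggregation")}
# {'deduplicate': 5, 'first-row': 10, 'aggregation': 40}
```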
It pays to be precise about why read latency degrades when compaction falls behind. A point lookup or range scan in an LSM-tree must consult every sorted run that could contain the key, because a newer run may shadow an older one. With a handful of runs that is cheap; with dozens of un-compacted level-0 files it becomes a fan-out read where each candidate file is opened and its index consulted. This is the LSM “read amplification” cost, and it is the direct, mechanical reason a Paimon table whose compaction has stalled feels sluggish on reads even though writes are still fast. Bloom filters on each run mitigate the cost for point lookups by cheaply ruling out runs that cannot hold the key, but range scans get no such relief — they must merge across all overlapping runs.
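Read amplification is easy to see by counting how many runs a lookup has to probe. The sketch below models runs as dicts and ignores Bloom filters and file indexes; `point_lookup` and the run contents are hypothetical.

```python
def point_lookup(runs_newest_first, key):
    """Return (value, probes), checking runs from newest to oldest."""
    probes = 0
    for run in runs_newest_first:
        probes += 1
        if key in run:
            return run[key], probes  # the newest run holding the key wins
    return None, probes


# Twelve small un-compacted runs that do not hold the key, then one old base run.
runs = [{f"k{i}": i} for i in range(12)] + [{"target": "v"}]
value_before, probes_before = point_lookup(runs, "target")  # ('v', 13)

# After compaction everything lives in a single run.
compacted = {}
for run in reversed(runs):  # apply oldest first so newer values win
    compacted.update(run)
value_after, probes_after = point_lookup([compacted], "target")  # ('v', 1)
```

The answer is the same before and after; only the number of files touched changes, which is the sluggishness a stalled compaction produces.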
Iceberg compaction is a maintenance action, not an intrinsic part of every write. You run rewrite_data_files to combine small files and rewrite_manifests to consolidate metadata, typically on a schedule. For merge-on-read tables you also compact away delete files. Iceberg v3’s deletion vectors change the economics here: instead of many positional delete files, v3 keeps at most one compact bitmap per data file per snapshot, stored as a blob in a Puffin file, giving constant-time “is this row deleted” checks and up to roughly 10x faster DML than copy-on-write in vendor benchmarks. That narrows, but does not erase, Paimon’s streaming-update advantage.
The deeper point is who owns compaction timing. With Iceberg you schedule it as a separate job and reason about its interaction with concurrent writers and snapshot expiration; a missed or mistuned maintenance schedule quietly degrades the table until someone notices slow queries. With Paimon, compaction is continuous and can run inside the writing job or as a dedicated compaction job, so the table is more self-maintaining — but you pay for it continuously in CPU and IO whether or not the read side currently needs it. Neither is free; one defers the cost to a scheduled batch, the other amortizes it into the stream.
A frequently overlooked operational choice is where Paimon’s compaction runs. Inline compaction inside the writing Flink job is simplest but couples write throughput to compaction load: a compaction-heavy moment will backpressure ingestion. A dedicated compaction job decouples the two — ingestion stays fast while a separate job grinds through merges — at the cost of running and monitoring a second job and reasoning about the lag between them. High-throughput Paimon deployments almost always split compaction out for exactly this reason, which is itself a sign that “self-maintaining” is a relative claim: Paimon maintains itself, but you still own the compute budget and the topology that pays for it.
Changelog producers: Paimon’s signature capability
The feature that most cleanly separates the two is Paimon’s changelog producer. Downstream streaming consumers need a complete, correct change stream — every insert, update-before, update-after, and delete. Paimon offers four strategies:
- None — no changelog; consumers see only the merged table state.
- Input — pass input rows straight through as changelog; only correct when the source is already a complete changelog such as a database binlog, and the cheapest option.
- Lookup — before each commit, perform batch point lookups to compute the precise change for each key, triggering compaction and emitting complete changelog; the generally recommended default.
- Full-compaction — diff the results of successive full compactions and emit the difference; correct for any source but high-latency and expensive.
The cost ladder across these four is steep and worth internalizing. Input adds essentially nothing because it forwards what it already has, but it is only correct when the upstream is genuinely a complete changelog. Lookup pays for a batch of point lookups per commit to reconstruct the precise before-image of each changed key, which is moderate and predictable. Full-compaction pays for an entire compaction pass and a diff of its output, which is the most expensive and the highest-latency, justified only when no cheaper option is correct for your source. Choosing among them is therefore a correctness-first, cost-second decision: identify the cheapest producer that is provably correct for your specific source semantics, never the cheapest one that merely seems to work in a quick test.
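The lookup strategy can be sketched as "look up the old value for each changed key, then emit the precise change". The row-kind labels `+I`, `-U`, `+U`, and `-D` follow Flink's changelog convention; everything else, including `lookup_changelog` and the in-memory `state` dict, is a simplified assumption rather than Paimon's implementation.

```python
def lookup_changelog(state, upserts, deletes=()):
    """Apply changes to `state` and return the complete change stream."""
    changes = []
    for key, new_value in upserts:
        if key not in state:
            changes.append(("+I", key, new_value))       # insert
        else:
            old_value = state[key]                        # the point lookup
            if old_value != new_value:
                changes.append(("-U", key, old_value))   # update-before
                changes.append(("+U", key, new_value))   # update-after
        state[key] = new_value
    for key in deletes:
        if key in state:
            changes.append(("-D", key, state.pop(key)))  # delete with before-image
    return changes


state = {1: "alice"}
changelog = lookup_changelog(state, [(1, "alice-v2"), (2, "bob")], deletes=[1])
```

The before-image in each `-U` and `-D` row is exactly what the point lookup pays for, and it is what an input producer cannot supply unless the source already carries it.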
Iceberg’s path to the same outcome is newer: v3 row lineage assigns stable row IDs and sequence numbers, enabling native incremental reads and CDC without bolt-on tooling. It is a real capability, but Paimon’s changelog producers are more mature and more configurable for sub-minute streaming pipelines today.
A back-of-envelope on the cost difference is instructive (these numbers are illustrative, not benchmarked, and your mileage will vary with hardware and tuning). Suppose a pipeline ingests 50,000 mutations per second into a primary-key table. With Iceberg you might checkpoint the Flink sink every 30–60 seconds to keep commit and metadata overhead sane, which caps freshness at roughly that interval and produces large, infrequent snapshots that downstream consumers see in bursts. With Paimon the same stream lands continuously into level-0 runs, and a lookup changelog producer can surface a complete change stream within a few seconds of ingest, at the cost of continuous compaction CPU. The lever you are actually pulling is freshness-versus-compute: Iceberg spends less compute and delivers coarser freshness; Paimon spends more compute and delivers finer freshness. Quantifying that trade for your mutation rate, key cardinality, and freshness SLO is the single most useful experiment before committing.
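The arithmetic behind those illustrative numbers is short enough to write out. The helper names are hypothetical, and the inputs are the same non-benchmarked figures used in the paragraph above.

```python
def mutations_per_commit(rate_per_second, checkpoint_seconds):
    """How many mutations one checkpoint-aligned commit has to carry."""
    return rate_per_second * checkpoint_seconds


def commits_per_day(checkpoint_seconds):
    """How many snapshots a single writer produces per day."""
    return 86_400 // checkpoint_seconds


rate = 50_000
at_30s = mutations_per_commit(rate, 30)  # 1_500_000 mutations per commit
at_60s = mutations_per_commit(rate, 60)  # 3_000_000 mutations per commit
daily_30s = commits_per_day(30)          # 2_880 snapshots per day
daily_60s = commits_per_day(60)          # 1_440 snapshots per day
```

Halving the checkpoint interval halves worst-case staleness but doubles the snapshot count that maintenance must later expire.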

Figure 3: A database binlog flows through Flink into a Paimon primary-key table; the changelog producer emits a complete change stream that downstream streaming jobs and batch engines both consume.
Catalog and engine support
A table format is only as useful as the engines that read it. Here is the practical 2026 picture.
| Capability | Apache Iceberg | Apache Paimon |
|---|---|---|
| Core storage model | Snapshot + manifest tree | LSM-tree per bucket |
| Native primary key | No (delete files / v3 row lineage) | Yes, first-class |
| Streaming upsert latency | Checkpoint-batched (tens of seconds) | Near-real-time (LSM append) |
| Changelog / CDC out | v3 row lineage incremental reads | 4 changelog producers |
| Row-level deletes | Deletion vectors (v3) | LSM tombstones |
| Spark read/write | Mature, first-class | Read + write, maturing |
| Trino / Presto read | Mature | Supported, maturing |
| Flink integration | Sink + source | Deepest of any format |
| Catalogs | REST, Hive, Glue, JDBC, Nessie | REST, Hive, JDBC, filesystem |
| Vendor adoption | Snowflake, Databricks, BigQuery, AWS | Alibaba, streaming-first shops |
| Spec maturity (2026) | v3 GA across vendors | 1.x, Iceberg-compatible output |
The decision matrix above is the heart of the Apache Iceberg vs Paimon call. Read it as a gradient, not a scoreboard: Iceberg leads on breadth of engines and vendor-managed catalogs; Paimon leads on streaming ingest and CDC ergonomics. Notably, Paimon can emit Iceberg-compatible snapshots, so a Paimon table can be read by Iceberg-aware engines, a hedge that blurs the boundary. For the engines that query these tables, see our Trino vs Presto vs Spark lakehouse query engines guide, and for the analytical stores downstream, the ClickHouse vs Doris vs StarRocks OLAP comparison.
One catalog nuance is worth stressing because it affects governance more than performance. Iceberg’s REST catalog has become a de facto interoperability layer — multiple vendors implement it, so a single catalog endpoint can broker access for many engines with consistent credential vending and table-level governance. Paimon supports a REST catalog too, but its most battle-tested catalogs are Hive Metastore, JDBC, and filesystem, and its uniquely deep tie is to Flink, where it is the only catalog that backs Flink materialized tables. If your platform strategy centers on a vendor-managed, multi-engine catalog with fine-grained access control, that is an Iceberg-leaning signal independent of the storage mechanics.
The Iceberg-compatible-output bridge deserves a caution, because teams over-read it. When Paimon emits Iceberg-compatible snapshots, it is exposing a read-only Iceberg view of its merged state — Iceberg-aware engines can scan it, but they cannot write back through that interface, and the Iceberg view trails the live Paimon table by a compaction-and-snapshot interval. So the bridge is genuinely useful for fanning a Paimon-ingested table out to a broad analytical readership without a second pipeline, but it is not a true bidirectional dual-format table. Treat it as “Paimon is the owner; Iceberg is a derived read surface,” and you will set the right expectations with downstream teams.
Bucketing and tuning: where Paimon teams actually spend time
A Paimon detail with no Iceberg analog is bucket count. Each primary-key table is hash-partitioned into a fixed or dynamic number of buckets, and the bucket is the unit of LSM parallelism. Set it too low and write parallelism and compaction throughput bottleneck; set it too high and each bucket’s LSM-tree holds too little data, fragmenting into many tiny files and inflating metadata. Dynamic bucketing eases this, but rescaling buckets on an existing table is a non-trivial operation, so capacity-planning bucket count against expected key volume and parallelism is real upfront work. Iceberg has no equivalent knob because it does not maintain a per-key sorted structure; its tuning surface is partition spec design, target file size for rewrite_data_files, and snapshot retention. The point is not that one has more knobs — it is that the knobs are different in kind, and a team fluent in Iceberg maintenance will not automatically be fluent in Paimon LSM tuning.
A worked rule of thumb makes the bucket-count problem concrete. Buckets are the unit of parallel writing and compaction, so a sensible starting point is to size them so each bucket receives a manageable steady-state data volume — many teams aim for buckets in the low single-digit gigabytes of compacted data each, then set the count so that buckets ≈ total table size / target-bucket-size, rounded up to comfortably exceed the parallelism of the writing job so no slot starves. The trap is that this is sized for today’s volume; a table that grows tenfold over a year will find its once-reasonable buckets bloated, with deep LSM-trees and long compaction merges. Because rescaling buckets means reorganizing data, the durable advice is to provision for projected growth and to prefer dynamic bucketing where the access pattern allows it, so the system can absorb growth without a manual reshard. None of this has any analog on the Iceberg side, which is precisely why “we already run Iceberg” does not transfer to Paimon operations.
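That rule of thumb translates into a small sizing helper. This is a sketch of the heuristic as described, not a Paimon recommendation: `suggest_bucket_count`, the `growth_factor` parameter, and the sample sizes are all assumptions.

```python
import math


def suggest_bucket_count(table_gb, target_bucket_gb, writer_parallelism, growth_factor=1.0):
    """Size buckets by data volume, but never below the writer parallelism."""
    projected_gb = table_gb * growth_factor
    by_size = math.ceil(projected_gb / target_bucket_gb)
    # At least one bucket per writer slot so no slot sits idle.
    return max(by_size, writer_parallelism)


today = suggest_bucket_count(500, 2, writer_parallelism=64)                      # 250
with_growth = suggest_bucket_count(500, 2, writer_parallelism=64, growth_factor=10)  # 2500
small_table = suggest_bucket_count(40, 2, writer_parallelism=64)                 # 64
```

The gap between `today` and `with_growth` is the reshard you avoid by provisioning for projected volume up front.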
Trade-offs, Gotchas, and What Goes Wrong
Neither format is free of sharp edges, and the failures are predictable once you know the design.
Iceberg’s small-file and metadata bloat. Point Flink at an Iceberg sink with a short checkpoint interval and you generate thousands of tiny files and a snapshot per checkpoint. Without aggressive rewrite_data_files and snapshot expiration, scans crawl and metadata listing dominates query planning. The fix is operational discipline, but it is discipline you must own.
Paimon’s read-side and write amplification. The LSM-tree that makes upserts cheap makes large full-table batch scans pay a merge cost, and write amplification consumes CPU and IO under heavy update load. Mis-tuned bucket counts are a classic foot-gun: too few buckets bottleneck parallelism, too many fragment the LSM-trees into inefficient tiny files.
Changelog correctness pitfalls. The changelog producer choice is correctness-critical, not just a performance setting. Using input on a source that is not a complete changelog silently emits an incorrect stream. There are also known edge cases: for example, lookup changelog with deletion vectors can, in certain re-insert-after-delete sequences, fail to produce a changelog and cause CDC consumers to diverge. Validate your changelog end-to-end against ground truth before trusting it.
Snapshot and changelog retention interact dangerously. A subtle Paimon failure is configuring a downstream consumer to read the changelog while expiring snapshots aggressively to control storage. If the consumer falls behind further than the retention window, the changelog it needs is already gone, and the consumer either errors or silently resumes from a later point, dropping changes in between. The same class of bug exists on the Iceberg side as snapshot expiration racing a long-running incremental reader. Retention is not just a storage-cost knob; it is a correctness boundary for every streaming reader, and it must be sized against your slowest consumer’s worst-case lag, not its average.
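A retention check against worst-case lag is simple to automate. The sketch below is illustrative: `retention_gaps`, the consumer names, the lag figures, and the 1.5x safety factor are all assumptions, not values from either project.

```python
def retention_gaps(retention_minutes, worst_case_lag_minutes, safety_factor=1.5):
    """Return consumers whose worst-case lag, with margin, exceeds retention."""
    return sorted(
        name
        for name, lag in worst_case_lag_minutes.items()
        if lag * safety_factor > retention_minutes
    )


# Worst-case lag per consumer in minutes (hypothetical figures).
lags = {"dashboard": 5, "feature-job": 20, "nightly-export": 90}

at_risk_60 = retention_gaps(60, lags)    # ['nightly-export']
at_risk_180 = retention_gaps(180, lags)  # []
```

Sizing against the average lag would have passed the 60-minute window here; sizing against the slowest consumer's worst case does not.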
Ecosystem maturity asymmetry. Iceberg’s multi-engine, multi-vendor support is battle-tested at petabyte scale across clouds. Paimon’s batch-analytics integrations with Spark and Trino are real but younger; you will hit fewer paved roads and more configuration outside the Flink-centric happy path.
Schema evolution and time-travel parity. Iceberg’s schema evolution — adding, dropping, renaming, and reordering columns without rewriting data, plus partition-spec evolution — is among the most mature in the ecosystem, and its snapshot model gives clean time travel and rollback by construction. Paimon supports schema evolution and snapshot-based time travel too, but the LSM merge semantics mean you must reason about how historical reads interact with compaction and snapshot expiration. If auditable rollback and long time-travel windows are core requirements, validate them explicitly on Paimon rather than assuming parity.
The “open” trap. Both are open, but Paimon’s center of gravity is the Flink ecosystem and Alibaba-led development, while Iceberg has genuinely diverse governance across multiple cloud and analytics vendors. That governance diversity is itself a risk-mitigation property: it reduces the chance that one vendor’s roadmap dictates the format’s direction. Weigh that against your team’s existing stack and risk tolerance.
Practical Recommendations
Start from the dominant access pattern, not the brand. If your table is fed by continuous CDC or streaming upserts and consumed by downstream streaming jobs that need a change stream, Paimon’s LSM-tree and changelog producers are the native fit and will cost you far less engineering than forcing Iceberg into that shape. If your table is a large analytical asset written in periodic batches and queried broadly across Spark, Trino, Snowflake, and cloud warehouses, Iceberg’s neutrality, vendor support, and v3 deletion vectors make it the safer default.

Figure 4: A decision path. If the dominant access pattern is streaming upserts and CDC, choose Paimon for its LSM-tree and changelog producers; if it is periodic batch analytics across many engines, choose Iceberg for its snapshots and broad engine support; when both coexist, run a hybrid where Paimon ingests and emits Iceberg-compatible snapshots for the analytical readership.
The flowchart above encodes the single most important habit in this decision: branch on the workload’s commit cadence and consumption pattern before you weigh any feature. The hybrid leaf is not a cop-out — it is increasingly the correct answer for organizations that genuinely have both shapes of table. Ingesting with Paimon for low-latency upserts and exposing an Iceberg-compatible read surface for the broad analytical readership lets one storage layout serve both audiences, rather than maintaining two pipelines that must be kept consistent. Just remember the caveat from earlier: the Iceberg surface is a derived, slightly-trailing read view, so the hybrid is right when the analytical readers tolerate snapshot-interval freshness, and wrong when they need the same second-latency the streaming consumers get.
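The decision path can be encoded as a small function. This is one reading of the flowchart, not a definitive rule: `choose_format` and its parameter names are hypothetical, and the branch where batch readers cannot tolerate snapshot lag (have them query Paimon directly) is an assumption the text does not spell out.

```python
def choose_format(streaming_upserts_cdc, multi_engine_batch,
                  batch_readers_tolerate_snapshot_lag=True):
    """Branch on access pattern first, features second."""
    if streaming_upserts_cdc and multi_engine_batch:
        if batch_readers_tolerate_snapshot_lag:
            return "hybrid: Paimon ingest + Iceberg-compatible read surface"
        # Assumption: the derived Iceberg view is too stale, so readers go direct.
        return "paimon: batch readers query Paimon directly"
    if streaming_upserts_cdc:
        return "paimon"
    if multi_engine_batch:
        return "iceberg"
    return "undecided"  # neither pattern dominates; classify the workload first


cdc_only = choose_format(True, False)   # 'paimon'
batch_only = choose_format(False, True) # 'iceberg'
both = choose_format(True, True)
```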
Avoid choosing on benchmark headlines: the LSM versus snapshot trade is structural and will not be benchmarked away.
Decision checklist:
- [ ] Classify the workload: streaming-upsert/CDC versus periodic-batch-analytics.
- [ ] List the engines that must read the table today and in 18 months.
- [ ] Decide whether you need a native primary key and a change stream out.
- [ ] If Paimon: pick the merge engine and changelog producer, and load-test bucket count.
- [ ] If Iceberg: schedule rewrite_data_files, manifest rewrite, and snapshot expiration; adopt v3 deletion vectors.
- [ ] Validate changelog/CDC correctness against ground truth before production.
- [ ] Size snapshot and changelog retention against your slowest consumer’s worst-case lag, not its average.
Frequently Asked Questions
Is Apache Paimon a replacement for Apache Iceberg?
Not exactly — they optimize for different workloads. Paimon is purpose-built for streaming upserts and CDC via its LSM-tree and changelog producers, while Iceberg is the broader, more vendor-supported standard for large analytical tables. Many teams run both, and because Paimon can emit Iceberg-compatible snapshots, a single Paimon table can serve low-latency ingest and be read by Iceberg-aware engines, reducing the pressure to pick only one.
Which format is better for change data capture pipelines?
Paimon currently has the edge for CDC. Its primary-key tables and four changelog producers (none, input, lookup, full-compaction) are designed to absorb database binlogs and emit a complete, correct change stream at low latency. Iceberg v3 added row lineage for native incremental reads, which is real progress, but Paimon’s changelog tooling is more mature and configurable for sub-minute streaming consumers in 2026.
Does Iceberg v3 close the gap with Paimon on updates?
It narrows it. Iceberg v3 deletion vectors store one compact bitmap per data file instead of many positional delete files, giving fast row-level deletes and reportedly up to about 10x faster DML than copy-on-write. That makes Iceberg far better at frequent updates than before. But the per-commit snapshot model still favors batched writes, so Paimon’s LSM-tree remains the more natural home for continuous streaming upserts.
Can Spark and Trino query Paimon tables?
Yes. Paimon supports reads and writes from Spark and reads from Trino and Presto, alongside its deepest-in-class Flink integration. The integrations are functional and improving, but they are younger and less paved than Iceberg’s, which is first-class across Spark, Trino, Snowflake, Databricks, BigQuery, and AWS analytics services. If broad multi-engine analytics is your priority, Iceberg is the lower-friction choice today.
What is an LSM table format and why does it matter here?
An LSM (Log-Structured Merge-tree) table format writes incoming records as small sorted runs and merges them in the background via compaction, deduplicating by key. It makes writes and upserts cheap at the cost of read-time merging and write amplification. Paimon uses this structure, which is why it natively handles streaming updates and changelog generation that a snapshot-based format like Iceberg handles only by batching.
How do catalogs differ between the two formats?
Iceberg supports a wide catalog ecosystem — REST, Hive Metastore, AWS Glue, JDBC, and Nessie — with several vendor-managed options. Paimon supports REST, Hive Metastore, JDBC, and filesystem catalogs, and is the most deeply integrated catalog for Flink, including backing materialized tables. Iceberg’s catalog breadth is wider for cloud and multi-engine governance; Paimon’s catalog is tightly coupled to its streaming-lakehouse strengths.
Further Reading
- Flink vs Spark Streaming vs Kafka Streams comparison (2026) — choosing the engine that feeds your table format.
- ClickHouse vs Doris vs StarRocks OLAP ADR (2026) — the analytical stores that often sit downstream of the lake.
- Trino vs Presto vs Apache Spark lakehouse query engines (2026) — what actually queries these tables.
- Apache Iceberg specification — authoritative source for snapshots, manifests, and v3 deletion vectors.
- Apache Paimon documentation — primary-key tables, merge engines, and changelog producers.
By Riju
