Iceberg vs Delta vs Hudi for IIoT Lakehouse: An ADR

Iceberg vs Delta vs Hudi for IIoT Lakehouse: An ADR (2026)

Choosing between Iceberg, Delta, and Hudi is one of the most consequential decisions your industrial data platform will make this decade. The open table format underneath your lakehouse dictates how you ingest billions of sensor readings, how you handle the schema drift that every fleet of devices eventually produces, and which query engines your analysts and ML teams can use without a translation layer. Get it wrong and you are locked into a single compute vendor, fighting small-file storms, or rewriting petabytes during a migration two years from now. This is exactly the kind of decision that deserves an Architecture Decision Record: a dated, immutable artifact that captures the context, the options, the trade-offs you accepted, and the consequences you will live with. This post is that ADR, written for the high-volume, append-mostly, late-arriving reality of industrial IoT and time-series data.

What this covers: the ADR structure (Context, Decision Drivers, Options, Comparison, Decision, Consequences), a weighted decision matrix, the write-path internals that actually separate these formats, and a clearly reasoned recommendation for the IIoT context, with the honest caveat that the right answer depends on your stack.

Why an ADR for the lakehouse table format?

An open table format ADR matters because the choice is durable, expensive to reverse, and touches every team that reads or writes industrial data. The table format is the contract between your object store and every engine — Spark, Flink, Trino, and your warehouse — so picking it casually means relitigating it for years. Writing it down forces the trade-offs into the open and gives future engineers the receipt for why.

An Architecture Decision Record is not a design doc and it is not marketing. It is a short, dated document that captures one significant choice: the forces that pressured it, the alternatives weighed, the decision taken, and the consequences accepted. The reason this particular decision needs one is that the three leading open table formats — Apache Iceberg, Delta Lake, and Apache Hudi — have converged enough that a glossy feature checklist will tell you they are all “ACID transactions on object storage with time travel and schema evolution.” That checklist is true and useless. The differences that matter for industrial data live in the write path, the metadata layout, the catalog model, and the breadth of engines that can read your tables without a translation layer.

For an IIoT lakehouse the stakes are specific. You are not running a few hundred analyst queries a day against curated dimensions. You are landing telemetry from thousands of devices, each emitting at sub-second to minute cadence, with gateways that buffer and replay late data, firmware updates that add or rename fields mid-stream, and downstream consumers that range from Grafana dashboards to anomaly-detection models to PLM and digital-twin systems. The table format has to survive all of that without operational heroics. That is the lens for everything below.

Context: How the lakehouse table format became the decision

The “lakehouse” pattern emerged to end a decade-long split between cheap-but-dumb data lakes (files on object storage, no transactions, no schema enforcement) and expensive-but-rigid warehouses. The open table format is the layer that makes a directory of Parquet files behave like a real table — atomic commits, snapshot isolation, schema and partition metadata, and time travel — while keeping the data in open formats on storage you control.

Three projects won the field. Apache Iceberg originated at Netflix to fix the correctness and scale problems of Hive tables, and it became an Apache top-level project with a deliberately engine-neutral design. Delta Lake originated at Databricks, was open-sourced in 2019 and later moved under the Linux Foundation, and is the most battle-tested format on Spark. Apache Hudi originated at Uber to solve a problem the other two initially ignored: efficient record-level upserts and incremental processing on a lake, driven by CDC and mutable streaming workloads.

By 2026 the competitive picture has shifted from “which format has feature X” to “which ecosystem and catalog will the rest of the industry standardize on.” The Iceberg REST catalog specification has become a de-facto interoperability layer, with multiple vendors and the major cloud warehouses offering Iceberg-compatible catalogs. Delta Lake’s UniForm (Universal Format) generates Iceberg and Hudi metadata alongside Delta metadata so that a Delta table can be read as an Iceberg table by foreign engines. Hudi continues to differentiate on write efficiency for mutable data and has invested in broader engine reads. The formats are converging on capabilities while diverging on where their center of gravity sits — and that center of gravity is what an IIoT platform must align with.

Figure 1: A reference IIoT lakehouse. Streaming and batch ingest land data in object storage, the table format plus catalog turn files into transactional tables, and Spark, Flink, and Trino read the same data.

Two adjacent decisions frame this one. Your hot-path time-series store — see our comparison of InfluxDB vs TimescaleDB vs ClickHouse for IoT — handles low-latency operational queries, while the lakehouse is the durable, multi-engine analytical tier behind it. And how raw telemetry reaches the lake often runs through Kafka with tiered storage via KIP-405, which changes the economics of buffering before the table format ever sees a row.

Decision Drivers

Before evaluating options, the ADR names the forces. For an IIoT lakehouse these are the decision drivers, roughly in priority order:

  • Append-mostly, high-volume ingestion. Telemetry is overwhelmingly inserts. The format must commit huge volumes of new rows cheaply and without metadata bloat. Upsert capability matters, but it is secondary to sustained append throughput for most sensor data.
  • Late-arriving and out-of-order data. Field gateways buffer during connectivity loss and replay hours later. The format must absorb late writes into the correct partitions without expensive full-partition rewrites.
  • Schema drift. Firmware revisions add sensors, rename fields, and change types. Schema evolution must be safe, metadata-only, and not require rewriting historical data.
  • Partition strategy that survives growth. Time-based partitioning is natural for telemetry, but the right granularity changes as volume grows. Re-partitioning a multi-petabyte table without rewriting it is a hard requirement at scale.
  • Streaming ingestion. Flink and Spark Structured Streaming write continuously. The format’s streaming write and exactly-once semantics matter.
  • Multi-engine interoperability. Industrial data is consumed by dashboards (Trino/Presto), ML (Spark/Ray), real-time (Flink), and increasingly the warehouse and digital-twin/PLM tools. Lock-in to one engine is a strategic risk.
  • Small-file and compaction management. Streaming + late data produces many small files. The format’s compaction and clustering story determines query performance and storage cost.
  • Catalog and governance. A REST-style, engine-agnostic catalog with lineage and access control is increasingly table stakes for a regulated industrial environment.
  • Operational complexity and ecosystem momentum. The format you pick must still be widely supported and hired-for in three years.

Options Considered

We evaluate the three open table formats against those drivers. The internal structures differ more than the marketing implies; understanding them is the whole point of the ADR.

Figure 2: Table-format internals at a glance. All three track data files plus a metadata layer, but Iceberg uses manifest lists and snapshots, Delta uses an ordered transaction log, and Hudi uses a timeline plus file groups.

Option A — Apache Iceberg

Iceberg models a table as a tree of immutable metadata: the table metadata file records the current snapshot, each snapshot points to a manifest list, the manifest list points to manifest files, and the manifests list the data files and their statistics. Every commit produces a new snapshot atomically, giving clean snapshot isolation and time travel by snapshot ID or timestamp.
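The snapshot model can be illustrated with a toy in-memory table. This is a minimal sketch, not Iceberg’s API: `ToyTable`, `Snapshot`, and `Manifest` are hypothetical names, and real manifests also carry per-file statistics and partition data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Manifest:
    """Lists data files (a real manifest also stores per-file statistics)."""
    data_files: Tuple[str, ...]


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    manifest_list: Tuple[Manifest, ...]  # one manifest list per snapshot


@dataclass
class ToyTable:
    snapshots: Dict[int, Snapshot] = field(default_factory=dict)
    current_id: Optional[int] = None

    def append(self, files: List[str], timestamp_ms: int) -> int:
        """Commit new files as a new snapshot; older snapshots stay untouched."""
        parent: Tuple[Manifest, ...] = ()
        if self.current_id is not None:
            parent = self.snapshots[self.current_id].manifest_list
        new_id = len(self.snapshots) + 1
        manifests = parent + (Manifest(tuple(files)),)  # reuse old manifests
        self.snapshots[new_id] = Snapshot(new_id, timestamp_ms, manifests)
        self.current_id = new_id  # the commit is this single pointer swap
        return new_id

    def files(self, snapshot_id: Optional[int] = None) -> List[str]:
        """Data files visible in a snapshot (defaults to the current one)."""
        sid = self.current_id if snapshot_id is None else snapshot_id
        if sid is None:
            return []
        return [f for m in self.snapshots[sid].manifest_list for f in m.data_files]

    def snapshot_as_of(self, timestamp_ms: int) -> Optional[int]:
        """Latest snapshot committed at or before the given timestamp."""
        eligible = [s for s in self.snapshots.values() if s.timestamp_ms <= timestamp_ms]
        if not eligible:
            return None
        return max(eligible, key=lambda s: s.timestamp_ms).snapshot_id


table = ToyTable()
first = table.append(["telemetry-0001.parquet"], timestamp_ms=1_000)
second = table.append(["telemetry-0002.parquet"], timestamp_ms=2_000)
print(table.files())        # current snapshot sees both files
print(table.files(first))   # time travel by snapshot ID sees only the first
```

Because each snapshot is immutable and a commit only swaps the current pointer, readers pinned to an older snapshot keep seeing a consistent file list while writers continue to append.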

Iceberg’s two signature features are hidden partitioning and partition evolution. Hidden partitioning means the partition transform (for example, days(event_time)) is recorded in metadata, so queries filter on the raw column and Iceberg derives the partition pruning automatically — writers and readers never hand-craft partition paths, eliminating an entire class of human error common in Hive-style layouts. Partition evolution lets you change the partition spec (say, from daily to hourly) going forward without rewriting historical data; old data keeps its old spec and new data uses the new one. For telemetry whose volume grows over years, this is uniquely valuable.
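To make hidden partitioning and partition evolution concrete, here is a toy model in plain Python. It is a sketch under stated assumptions, not Iceberg’s API: `DataFile`, `write`, and `prune` are hypothetical names, and the only behaviour borrowed from Iceberg is that the day and hour transforms count whole days or hours since the Unix epoch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

UTC = timezone.utc
EPOCH = datetime(1970, 1, 1, tzinfo=UTC)


def days(ts: datetime) -> int:
    """Whole days since the Unix epoch."""
    return (ts - EPOCH).days


def hours(ts: datetime) -> int:
    """Whole hours since the Unix epoch."""
    return int((ts - EPOCH).total_seconds() // 3600)


TRANSFORMS = {"days": days, "hours": hours}


@dataclass(frozen=True)
class DataFile:
    path: str
    spec: str       # transform that was active when the file was written
    partition: int  # transform(event_time), shared by every row in the file


def write(path: str, event_time: datetime, spec: str) -> DataFile:
    """Writers pass raw event_time; the partition value is derived for them."""
    return DataFile(path, spec, TRANSFORMS[spec](event_time))


def prune(files: List[DataFile], lo: datetime, hi: datetime) -> List[str]:
    """Keep files that may contain rows with lo <= event_time <= hi.

    The filter is on the raw column; each file is checked with its own spec.
    """
    kept = []
    for f in files:
        transform = TRANSFORMS[f.spec]
        if transform(lo) <= f.partition <= transform(hi):
            kept.append(f.path)
    return kept


files = [
    # Written while the table was partitioned by day.
    write("d1.parquet", datetime(2026, 1, 1, 10, tzinfo=UTC), "days"),
    write("d2.parquet", datetime(2026, 1, 2, 10, tzinfo=UTC), "days"),
    # Spec evolved to hourly: only new files use it, old files are not rewritten.
    write("h1.parquet", datetime(2026, 1, 3, 9, tzinfo=UTC), "hours"),
    write("h2.parquet", datetime(2026, 1, 3, 15, tzinfo=UTC), "hours"),
]

selected = prune(files, datetime(2026, 1, 2, tzinfo=UTC), datetime(2026, 1, 3, 10, tzinfo=UTC))
print(selected)
```

The query never mentions a partition column, and files written under the old daily spec and the new hourly spec are pruned side by side, which is the property that lets granularity change without touching history.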

Iceberg has the broadest engine-neutral support: Spark, Flink, Trino, Presto, Dremio, Snowflake, BigQuery, and others read and write it natively. The REST catalog specification has become an interoperability standard that decouples the table from any single metastore. Schema evolution is full and metadata-only (add, drop, rename, reorder, widen types). Row-level updates and deletes are supported via merge-on-read delete files, with copy-on-write also available. Its historical weakness was that upsert ergonomics and small-file handling required tuning, though maintenance procedures (compaction, snapshot expiration, manifest rewriting) are mature.
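Metadata-only schema evolution works because Iceberg tracks each column by a unique field ID rather than by name. The sketch below is a toy illustration of that idea only; the column names, the dictionaries, and the `read` helper are hypothetical and do not mirror any Iceberg library call.

```python
from typing import Any, Dict, List, Optional

Row = Dict[int, Any]  # values keyed by field ID, never by column name

# A data file written by old firmware under schema v1.
old_file: List[Row] = [{1: "pump-7", 2: 71.5}]
schema_v1: Dict[int, str] = {1: "device_id", 2: "temp"}

# Firmware update: "temp" is renamed and a vibration sensor is added.
# Only the metadata changes; old_file is not rewritten.
schema_v2: Dict[int, str] = {1: "device_id", 2: "temperature_c", 3: "vibration_mm_s"}
new_file: List[Row] = [{1: "pump-7", 2: 72.0, 3: 0.4}]


def read(rows: List[Row], schema: Dict[int, str]) -> List[Dict[str, Optional[Any]]]:
    """Project rows through a schema; fields absent from old files read as None."""
    return [{name: row.get(field_id) for field_id, name in schema.items()} for row in rows]


history = read(old_file, schema_v2) + read(new_file, schema_v2)
for record in history:
    print(record)
```

The rename resolves correctly for the old file because field ID 2 still identifies the same values, and the added column simply reads as missing for rows written before it existed.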

Authoritative reference: the Apache Iceberg table specification and documentation detail the snapshot, manifest, and partition-evolution model.

Option B — Delta Lake

Delta represents a table as a directory of Parquet data files plus an ordered transaction log (_delta_log) of JSON commits, periodically checkpointed to Parquet for fast state reconstruction. Each commit is atomic; readers reconstruct the current table state by replaying the log. This log-centric design is simple, robust, and extremely well-optimized on Spark.
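Log replay can be sketched in a few lines. This is a simplified toy, not the Delta protocol implementation: it handles only `add` and `remove` actions keyed by `path`, skips the other action types a real log carries (such as metadata and protocol actions), and ignores checkpoints. The file names are hypothetical.

```python
import json
from typing import Iterable, Set


def replay(commits: Iterable[str]) -> Set[str]:
    """Fold ordered commits (newline-delimited JSON actions) into the live file set."""
    live: Set[str] = set()
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live


commits = [
    # Version 0: two small streaming files land.
    '{"add": {"path": "part-a.parquet"}}\n{"add": {"path": "part-b.parquet"}}',
    # Version 1: another append.
    '{"add": {"path": "part-c.parquet"}}',
    # Version 2: compaction replaces a and b with one larger file.
    '{"remove": {"path": "part-a.parquet"}}\n'
    '{"remove": {"path": "part-b.parquet"}}\n'
    '{"add": {"path": "part-ab.parquet"}}',
]

current = replay(commits)
as_of_version_1 = replay(commits[:2])  # replaying a prefix gives time travel
print(sorted(current))
print(sorted(as_of_version_1))
```

A checkpoint in this picture is just a saved result of `replay` at some version, so a reader can start from it and apply only the commits that follow.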

Delta’s strengths cluster around the Spark and Databricks ecosystem, where it is the default and the most production-hardened. Delta Kernel is a library that lets non-Spark engines read and write Delta correctly without reimplementing the protocol, broadening engine support. Liquid clustering replaces rigid Hive-style partitioning with an adaptive clustering scheme that handles skew and changing access patterns without manual repartitioning, which is directly relevant to telemetry whose query patterns evolve. UniForm is the convergence play: a Delta table can expose Iceberg (and Hudi) metadata so that foreign engines read it as if it were native Iceberg, blunting the interop disadvantage Delta historically had outside Spark.

Delta supports full schema evolution and MERGE for upserts, with strong performance on Spark. Its historical limitations for an IIoT context: partition evolution is not a first-class metadata-only operation the way Iceberg’s is (liquid clustering is the answer, but it is newer), and the deepest tooling and performance still assume a Spark-centric stack. Outside Spark the experience is improving fast via Kernel and UniForm but is not yet as uniformly native as Iceberg’s.

Authoritative reference: the Delta Lake documentation covers the transaction-log protocol, UniForm, and clustering.

Option C — Apache Hudi

Hudi was designed first for record-level upserts and incremental processing. A Hudi table organizes data into file groups keyed by a record key, and tracks all actions on a timeline. Its defining choice is two table types. Copy-on-Write (CoW) rewrites the affected data file on every update (read-optimized, write-amplified). Merge-on-Read (MoR) writes updates to row-based log files (delta files), merges them with the columnar base files at read time, and compacts asynchronously (write-optimized, read-amplified until compaction). This MoR design is purpose-built for high-frequency mutations and low-latency ingestion.
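The write-amplification trade-off between the two table types can be shown with a toy model. This is a sketch, not Hudi’s API: the classes and the `rows_written` counter are hypothetical, and a single dictionary stands in for one file group.

```python
from typing import Dict, List, Tuple


class CopyOnWrite:
    """Every upsert rewrites the whole base file with the change applied."""

    def __init__(self, base: Dict[str, float]) -> None:
        self.base = dict(base)
        self.rows_written = 0

    def upsert(self, key: str, value: float) -> None:
        new_base = dict(self.base)
        new_base[key] = value
        self.base = new_base
        self.rows_written += len(new_base)  # full-file rewrite

    def read(self) -> Dict[str, float]:
        return dict(self.base)  # nothing to merge at read time


class MergeOnRead:
    """Upserts append to a log; readers merge; compaction folds the log in."""

    def __init__(self, base: Dict[str, float]) -> None:
        self.base = dict(base)
        self.log: List[Tuple[str, float]] = []
        self.rows_written = 0

    def upsert(self, key: str, value: float) -> None:
        self.log.append((key, value))
        self.rows_written += 1  # only the changed record is written

    def read(self) -> Dict[str, float]:
        merged = dict(self.base)
        for key, value in self.log:  # later log entries win
            merged[key] = value
        return merged

    def compact(self) -> None:
        self.base = self.read()
        self.rows_written += len(self.base)  # one rewrite, paid later
        self.log.clear()


base = {f"dev-{i}": 0.0 for i in range(1000)}
cow, mor = CopyOnWrite(base), MergeOnRead(base)
for key, value in [("dev-1", 20.5), ("dev-2", 19.0), ("dev-1", 21.0)]:
    cow.upsert(key, value)
    mor.upsert(key, value)

print(cow.rows_written, mor.rows_written)  # rows physically written so far
mor.compact()
print(mor.rows_written)
```

Both tables return the same data on read; the difference is when the rewrite cost is paid, which is why MoR suits frequent small mutations and CoW suits read-heavy tables with rare updates.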

For IIoT, Hudi’s appeal is the mutable and CDC story: if your industrial data involves frequent corrections, dimension-like device-master updates, deduplication on a record key, or near-real-time upserts from a CDC stream, Hudi handles it with less write amplification than the alternatives. It has strong streaming ingestion (Hudi Streamer, formerly DeltaStreamer), built-in indexing to locate records for updates, and automatic file sizing and asynchronous compaction/clustering tuned for streaming. Schema evolution and incremental queries (“give me everything that changed since commit X”) are first-class, and incremental pulls are a genuinely useful primitive for downstream pipelines.

Hudi’s trade-offs: the operational model (timeline, indexes, compaction services, table services) has more moving parts, so ops complexity is higher. Multi-engine read support is real (Spark, Flink, Trino, Presto, and others) but the richest experience is on Spark and Flink; pure append-only analytics on Trino can be simpler on Iceberg. For overwhelmingly append-only telemetry, Hudi’s upsert machinery is capability you may pay for in complexity without fully using.

Authoritative reference: the Apache Hudi documentation explains CoW vs MoR, the timeline, and indexing.

Figure 3: Write paths diverge. Copy-on-write rewrites whole files on update, merge-on-read appends delta/log files and merges later, and pure appends simply add new data files plus a metadata commit.

Comparison: a weighted decision matrix

The ADR’s job is not to crown a universal winner — it is to score the options against your drivers with explicit weights. The weights below reflect the IIoT/time-series context: high-volume, append-mostly, late-arriving, schema-drifting, multi-engine. A different context (CDC-heavy, mutable, single-engine Databricks shop) would reweight and flip the result.

Scores are 1–5 (5 = strongest), qualitative and as of mid-2026. They reflect design fit and ecosystem maturity, not benchmark numbers — anyone quoting a single throughput figure across these formats is selling something.

Criterion                                  | Weight | Iceberg | Delta | Hudi
Schema evolution                           | 10%    | 5       | 4     | 4
Partition evolution / hidden partitioning  | 12%    | 5       | 3     | 3
Upsert / merge performance                 | 10%    | 3       | 4     | 5
Streaming ingestion                        | 12%    | 4       | 4     | 5
Compaction / clustering                    | 11%    | 4       | 4     | 5
Catalog / REST catalog                     | 12%    | 5       | 3     | 3
Engine interop (Spark / Flink / Trino)     | 15%    | 5       | 4     | 4
Ecosystem momentum                         | 10%    | 5       | 5     | 3
Ops complexity (higher score = simpler)    | 8%     | 4       | 4     | 3
Weighted total                             | 100%   | 4.49    | 3.86  | 3.91

Figure 4: The weighted decision matrix rendered as a scoring flow across the nine criteria. Weights reflect an append-mostly, multi-engine IIoT context and would shift for a mutable, single-engine workload.

A few honest caveats about the matrix. First, the weights are the argument — change them and the ranking changes. If “upsert/merge performance” and “streaming ingestion” dominated your context (a CDC-driven mutable platform), Hudi pulls ahead. If your entire stack is Databricks/Spark, Delta’s home-field advantage and liquid clustering make the interop and partition-evolution gaps far less relevant, and it leads. Second, the gaps are closing: UniForm narrows Delta’s interop deficit, Iceberg’s merge-on-read deletes narrow its upsert deficit, and all three keep adding clustering and catalog features. Third, “ecosystem momentum” is a judgment call about hiring, vendor support, and the REST catalog gravity — reasonable people weight it differently.
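The first caveat is easy to check by recomputing the totals. The sketch below uses the scores and weights from the matrix; `CDC_HEAVY_WEIGHTS` is a hypothetical reweighting invented here purely to illustrate how the ranking moves, not a recommendation.

```python
from typing import Dict, List, Optional

FORMATS = ("Iceberg", "Delta", "Hudi")

# (criterion, weight in percent, Iceberg, Delta, Hudi), copied from the matrix.
CRITERIA = [
    ("Schema evolution", 10, 5, 4, 4),
    ("Partition evolution / hidden partitioning", 12, 5, 3, 3),
    ("Upsert / merge performance", 10, 3, 4, 5),
    ("Streaming ingestion", 12, 4, 4, 5),
    ("Compaction / clustering", 11, 4, 4, 5),
    ("Catalog / REST catalog", 12, 5, 3, 3),
    ("Engine interop (Spark / Flink / Trino)", 15, 5, 4, 4),
    ("Ecosystem momentum", 10, 5, 5, 3),
    ("Ops complexity (higher score = simpler)", 8, 4, 4, 3),
]

# Hypothetical weights for a CDC-driven, mutable platform (same criterion order).
CDC_HEAVY_WEIGHTS = [10, 5, 25, 20, 15, 5, 10, 5, 5]


def weighted_totals(weights: Optional[List[int]] = None) -> Dict[str, float]:
    """Weighted average score per format; weights are percentages summing to 100."""
    if weights is None:
        weights = [row[1] for row in CRITERIA]
    if len(weights) != len(CRITERIA) or sum(weights) != 100:
        raise ValueError("need one weight per criterion, summing to 100")
    totals = {}
    for i, name in enumerate(FORMATS):
        points = sum(w * row[2 + i] for w, row in zip(weights, CRITERIA))
        totals[name] = round(points / 100, 2)
    return totals


baseline = weighted_totals()
cdc_heavy = weighted_totals(CDC_HEAVY_WEIGHTS)
print(baseline)   # Iceberg 4.49, Delta 3.86, Hudi 3.91
print(cdc_heavy)  # Hudi leads under the illustrative CDC-heavy weights
```

Keeping the weights in version control next to the ADR makes the argument auditable: a reviewer who disagrees with a weight can change one number and see whether the ranking actually moves.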

How to read the streaming and compaction rows

For streaming ingestion, Hudi scores highest because its MoR table type and built-in indexing were designed for continuous, mutable writes with bounded write amplification. Iceberg and Delta both support streaming writes well, but for the mutable streaming case they do more work. For pure append streaming — the dominant IIoT pattern — all three are strong, which is why this row does not dominate the total.

For compaction and clustering, Hudi’s automatic file sizing and asynchronous table services are the most turnkey for the small-file problem that telemetry creates. Iceberg and Delta require running compaction (Iceberg’s rewrite procedures, Delta’s OPTIMIZE/liquid clustering) but both are mature. The difference is “automatic by default” (Hudi) versus “scheduled maintenance you own” (Iceberg/Delta) — a real ops consideration.

Decision

Status: Accepted (context-dependent). For a greenfield IIoT/time-series lakehouse characterized by high-volume, append-mostly ingestion, late-arriving and out-of-order data, ongoing schema drift, time-based partitioning that must evolve as volume grows, and a requirement to serve multiple engines (Spark for ML, Flink for streaming, Trino for interactive analytics, plus warehouse and digital-twin/PLM consumers), we choose Apache Iceberg as the default open table format.

The reasoning, tied to the drivers:

  1. Partition evolution and hidden partitioning are decisive for telemetry. Industrial data volume grows by orders of magnitude over a platform’s life. Iceberg lets us start with daily partitions and move to hourly (or add a device/site transform) without rewriting history — a capability neither alternative matches as a first-class, metadata-only operation. Hidden partitioning also removes a whole category of writer/reader partition-path bugs.

  2. Engine neutrality and the REST catalog are strategic insurance. Our consumers are heterogeneous and will change. Iceberg’s broad native support and the REST catalog standard mean our tables are not hostage to one compute vendor. For a platform expected to outlive several analytics tools, this is the highest-leverage property.

  3. Append-mostly workloads do not need Hudi’s upsert machinery as the default. Hudi is excellent and would win for a mutable, CDC-heavy, low-latency-upsert platform. But paying its operational complexity for capability our dominant workload barely uses is the wrong trade for the primary tables. We reserve Hudi for specific mutable sub-domains (see below).

  4. Schema evolution is the single most-used feature for drifting IIoT data, and Iceberg’s is the most complete and safest.

This decision is explicitly not universal. Choose Delta instead if your platform is Spark/Databricks-centric — there, Delta’s maturity, liquid clustering, Delta Kernel, and UniForm (which gives you Iceberg-reader interop anyway) make it the lower-risk, higher-velocity choice, and the partition-evolution gap is largely absorbed by clustering. Choose Hudi for sub-systems dominated by record-level upserts, CDC ingestion, deduplication on a record key, or incremental-pull pipelines — for example a device-registry or asset-master table, or a corrections-heavy quality table. A mature platform may run Iceberg for the append-heavy telemetry core and Hudi for the mutable edges; that heterogeneity is legitimate when the catalog ties it together.

Figure 5: The IIoT streaming ingestion path. Devices and gateways feed a stream buffer, a stream processor handles late and out-of-order events, and writes commit into the lakehouse table format with snapshot isolation.

Consequences

An ADR is only honest if it records what we are signing up for — the good and the bad.

Positive consequences

  • Vendor and engine optionality. Telemetry tables are readable by Spark, Flink, Trino, and the major warehouses without translation. If we change our query engine or warehouse, the data stays put.
  • Cheap, safe schema and partition change. Firmware-driven schema drift and growth-driven repartitioning become metadata operations, not petabyte rewrites — directly reducing the most expensive class of maintenance for long-lived telemetry.
  • Standardized catalog. The Iceberg REST catalog gives one governance and access-control surface across engines, which simplifies lineage and compliance for regulated industrial data.
  • Clean time travel and reproducibility. Snapshot isolation and snapshot-ID time travel make ML training reproducible and audits straightforward.
  • Strong upstream momentum. Broad multi-vendor investment reduces the risk of betting on a format that stalls.

Negative consequences and what we accept

  • Compaction and small-file management are our responsibility. Streaming plus late data creates small files; we must schedule and own Iceberg’s rewrite/compaction and snapshot-expiration jobs. Hudi would have automated more of this. We accept this ops cost and will codify the maintenance jobs as platform infrastructure.
  • Upsert ergonomics are weaker than Hudi’s. Where we do need record-level updates (the mutable edges), merge-on-read deletes are workable but less efficient and less ergonomic than Hudi’s purpose-built path. We mitigate by routing genuinely mutable workloads to Hudi tables rather than forcing them into the Iceberg core.
  • Not the Spark/Databricks home-field default. If we run heavy Databricks workloads, we forgo some Delta-native conveniences and tooling depth. We accept this as the price of engine neutrality, and note UniForm means Delta-origin data could still be read as Iceberg if a sub-team standardizes on Delta.
  • Convergence risk cuts both ways. Because the formats are converging (UniForm, REST catalog adoption, cross-format reads), some of Iceberg’s advantages may narrow. We treat the decision as revisable in a future dated ADR, not as permanent.
  • Operational learning curve. Snapshot management, manifest rewriting, and catalog operations require team expertise we must build.

Mitigations and review trigger

We will: (1) codify compaction, snapshot expiration, and manifest-rewrite jobs as scheduled platform services from day one; (2) route mutable/CDC sub-domains to Hudi rather than over-extending Iceberg; (3) adopt a REST catalog to keep governance uniform; and (4) re-open this ADR if our workload shifts to predominantly mutable data, if we consolidate onto a single Spark/Databricks stack, or if cross-format convergence makes the interop advantage moot.

Trade-offs, gotchas, and what goes wrong

The failure modes are predictable. The most common is the small-file storm: streaming writers and frequent late-data commits produce millions of tiny files and bloated metadata, and query latency degrades until someone runs compaction. With Iceberg this is a maintenance discipline you must own; skipping it is the number-one cause of “the lakehouse got slow.”
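What a compaction job does at its core is bin-pack small files into fewer large ones. The planner below is a minimal greedy sketch, not the algorithm any of the three formats actually ships; `plan_compaction` and its thresholds are hypothetical, and sizes are plain numbers in megabytes.

```python
from typing import List


def plan_compaction(
    file_sizes_mb: List[float],
    target_mb: float = 512.0,  # assumed target size for rewritten files
    small_mb: float = 64.0,    # assumed cutoff below which a file counts as small
) -> List[List[float]]:
    """Greedily group small files so each group rewrites to about target_mb.

    Files at or above small_mb are left alone; a trailing single file is
    skipped because rewriting it alone gains nothing.
    """
    small = sorted((s for s in file_sizes_mb if s < small_mb), reverse=True)
    groups: List[List[float]] = []
    current: List[float] = []
    current_size = 0.0
    for size in small:
        if current and current_size + size > target_mb:
            groups.append(current)  # group is full, start a new one
            current, current_size = [], 0.0
        current.append(size)
        current_size += size
    if len(current) >= 2:
        groups.append(current)
    return groups


# One hundred 8 MB streaming files plus two healthy 600 MB files.
sizes = [8.0] * 100 + [600.0, 600.0]
plan = plan_compaction(sizes)
print([len(group) for group in plan])        # files per rewrite group
print([sum(group) for group in plan])        # output size per group in MB
```

In this example one hundred small files collapse into two rewritten files while the large files are untouched, which is the effect a scheduled maintenance job is meant to deliver before query planning slows down.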

The second is catalog sprawl — different teams pointing at different catalogs (Hive metastore here, Glue there, a REST catalog somewhere else) so the same logical table is governed three ways. Pick one catalog model and enforce it.

Third, partition-spec mistakes early on. Iceberg forgives them via evolution, but Delta and Hudi make early partition choices stickier; over-partitioning telemetry by device and hour creates millions of partitions and metadata pressure. Start coarse.

Fourth, forcing append data through upsert paths (or vice versa). Running append-only telemetry through Hudi’s upsert/index machinery incurs write amplification and indexing cost for no benefit; running genuinely mutable corrections through copy-on-write rewrites whole files needlessly. Match the table type to the workload.

Finally, assuming “open” means “free interop today.” UniForm and cross-format reads are real but version-dependent and occasionally lossy on edge features. Validate the specific engine-plus-format-plus-version combination you intend to use before you depend on it in production.

Practical recommendations

Translate the decision into action:

  • Default the telemetry core to Iceberg with time-based hidden partitioning at a coarse granularity (daily), planning to evolve to hourly as volume demands — no rewrite required.
  • Stand up a REST catalog first. The catalog is the governance and interop backbone; retrofitting it later is painful.
  • Codify maintenance as infrastructure. Schedule compaction, snapshot expiration, and manifest rewrites as first-class platform jobs, not afterthoughts.
  • Route mutable sub-domains to Hudi (device master, CDC, corrections) and let the catalog unify them.
  • If you are a Databricks/Spark shop, default to Delta, lean on liquid clustering, and enable UniForm for foreign-engine reads.
  • Validate interop combinations (engine + format + version) before depending on cross-format reads.
  • Write the ADR down and date it. Record the weights you used so the next engineer can see why the decision was what it was.

Checklist: catalog chosen → partition strategy coarse → compaction scheduled → mutable workloads routed → interop validated → ADR committed.

FAQ

Iceberg vs Delta vs Hudi — which is best for IIoT and time-series data?
For high-volume, append-mostly, late-arriving telemetry consumed by multiple engines, Apache Iceberg is the strongest default, mainly because of partition evolution, hidden partitioning, broad engine neutrality, and the REST catalog. Hudi wins for mutable, CDC-heavy, upsert-driven workloads, and Delta wins for Spark/Databricks-centric platforms. The right choice is genuinely context-dependent — score it against your own weighted drivers.

What is the real difference between Iceberg, Delta, and Hudi?
All three give ACID transactions, time travel, and schema evolution on object storage. The differences are structural: Iceberg uses snapshots and manifests with metadata-only partition evolution; Delta uses an ordered transaction log and is most optimized on Spark; Hudi uses a timeline and file groups built for record-level upserts via copy-on-write or merge-on-read. Those internals, not the feature checklist, determine fit.

Does Delta Lake’s UniForm make the choice irrelevant?
Not entirely. UniForm lets a Delta table expose Iceberg (and Hudi) metadata so foreign engines can read it as Iceberg, which meaningfully narrows Delta’s interop gap. But UniForm is read-oriented and version-dependent, and it does not give Delta Iceberg’s metadata-only partition evolution. It reduces lock-in risk rather than erasing the differences.

Is hidden partitioning unique to Iceberg?
Hidden partitioning — recording the partition transform in metadata so queries filter on raw columns and the engine derives pruning — is an Iceberg signature feature. Delta addresses the same goals differently with liquid clustering, and Hudi with its file-group layout and clustering. Iceberg’s combination of hidden partitioning and metadata-only partition evolution is the distinctive pairing.

When should I choose Apache Hudi for industrial data?
Choose Hudi when record-level upserts dominate: CDC ingestion, deduplication on a record key, frequent corrections, or incremental-pull pipelines (“everything changed since commit X”). A device-registry, asset-master, or corrections-heavy quality table is a strong Hudi candidate. For overwhelmingly append-only sensor streams, Hudi’s upsert and indexing machinery is capability you pay for in operational complexity without fully using.

Can I run more than one table format in the same lakehouse?
Yes, and mature platforms often do. A common pattern is Iceberg for the append-heavy telemetry core and Hudi for mutable sub-domains, with a single catalog tying them together so governance and discovery stay uniform. The cost is more operational surface area; the benefit is matching each table type to its workload instead of compromising on one format for everything.

Written by Riju — data-platform architect. More on the about page.
