DuckDB vs ClickHouse for Embedded Analytics: An ADR

The choice of DuckDB vs ClickHouse keeps landing on architecture review agendas because the two engines look superficially similar — both are columnar, both are blisteringly fast at analytical scans — yet they sit at opposite ends of the deployment spectrum. DuckDB runs inside your process. ClickHouse runs as a server, usually a cluster. That single distinction propagates into every operational property you care about: concurrency, scale ceiling, failure modes, and the size of the on-call rotation you need to keep it alive.

This is written as an architecture decision record. The goal is not to crown a winner but to document the context, the drivers, the options, and the consequences clearly enough that a future engineer can understand why the decision went the way it did — and can reopen it when the assumptions change.

What this covers: the in-process versus distributed-OLAP split, embedding analytics directly inside applications, querying data lakes over Parquet and Iceberg, the role of chDB as embedded ClickHouse, a weighted decision matrix, and a workload-driven decision tree.

Context and decision drivers

The decision context is a product team that needs analytical query capability somewhere in its stack. Concretely, that breaks into a few recurring shapes: an interactive dashboard embedded in a SaaS product, an ad-hoc exploration layer over a data lake, a metrics or observability backend, or a customer-facing “analytics” feature where each tenant slices their own data.

These shapes share a workload profile — large scans, aggregations, group-bys, joins over columnar data — but differ sharply in their non-functional requirements. Those requirements are the real decision drivers, and they are what separate DuckDB vs ClickHouse in practice.

The drivers we weighted, in rough priority order:

  • Deployment surface. Does analytics need to be a service you operate, or a library you link in? Operating a stateful distributed database has a standing cost that a linked library does not.
  • Concurrency. How many simultaneous queries, and how many concurrent writers? This is the single most decisive axis.
  • Scale ceiling. Will the working set outgrow one machine within the planning horizon? Single-node engines have a hard wall; distributed engines push it out with hardware.
  • Latency at the edges. Cold-start cost, time-to-first-query, and tail latency under load.
  • Data gravity. Where does the data already live — local files, object storage as Parquet, an Iceberg or Paimon table, or inside the engine’s own storage?
  • Operational and dollar cost. Headcount to run it, infrastructure spend, and the licensing or hosting model.
  • Ecosystem fit. Drivers, SQL dialect, language bindings, and the surrounding tooling your team already uses.

A useful framing throughout: DuckDB optimizes for eliminating operational surface, while ClickHouse optimizes for absorbing scale and concurrency. Most bad outcomes come from picking the engine whose strength you do not actually need while paying for it anyway.

Two further constraints shaped the weighting. First, the team’s standing operational budget: there was no appetite to add a stateful distributed database to the on-call rotation unless a hard requirement forced it, which pushes the bar for ClickHouse-the-server higher than a feature comparison alone would suggest. Second, data was already accumulating as Parquet in object storage rather than inside any database, so an engine that treats those files as a native query target — rather than requiring an ingest step into proprietary storage — started with an advantage. These are local facts, not universal truths; a team whose data already lives in MergeTree and whose SRE org already runs distributed databases would weight the same drivers differently and land elsewhere. That is precisely why an ADR records the context: the decision is only valid against the assumptions that produced it.

The options compared

Three concrete options are on the table, not two. The pure framing of “in-process versus server” hides a third path — chDB — that gives you ClickHouse’s engine inside your process.

Figure 1 contrasts the three deployment models: DuckDB linked directly into the application process reading local files and object storage, ClickHouse fronted by a network protocol and running as a server cluster over tiered storage, and chDB embedding the ClickHouse engine inside the application process.

Option A — DuckDB, in-process. DuckDB is a single-file analytical database that compiles into your application as a library. There is no server, no port, no separate process to supervise. You open a database handle, run SQL, and the work happens on the same CPU and RAM as your app. It reads its own native format, but its more common role in 2026 is as a query engine over external data — local Parquet, CSV, or remote object storage. See the DuckDB documentation for the current extension and format support.

Option B — ClickHouse, server. ClickHouse is a distributed columnar OLAP database designed to run as a long-lived service. Clients connect over the native protocol or HTTP. It owns its storage in the MergeTree family of table engines, supports replication and sharding, and is built to sustain very high ingest rates alongside large concurrent read fan-out. The ClickHouse documentation details the deployment and table-engine model.

Option C — chDB, embedded ClickHouse. chDB packages the ClickHouse query engine as an in-process library — conceptually “DuckDB-shaped ClickHouse.” You get ClickHouse SQL semantics, functions, and the MergeTree reader without standing up a server. It is the bridge option: in-process deployment with ClickHouse’s dialect and engine behavior.

The reason chDB matters to this ADR is that it breaks a false binary. Plenty of “DuckDB vs ClickHouse” debates implicitly assume that choosing ClickHouse means choosing a server and all its operational weight. chDB removes that coupling. A team that loves ClickHouse SQL but dreads running a cluster can embed the engine the same way it would embed DuckDB, then graduate to the server later if concurrency demands it — and the SQL ports across that boundary because it is the same dialect. That migration path, in-process to server within one engine family, is something DuckDB cannot offer, and it can be the deciding factor for teams that expect to grow into a server but want to start small.

The honest summary is that the DuckDB vs ClickHouse decision is really a two-step question. First: in-process or server? Then, if in-process: do you want DuckDB’s ergonomics or ClickHouse’s engine semantics via chDB?

Engine and architecture

Both engines are columnar and vectorized, and both put a query optimizer in front of execution, so the high-level query path looks alike. The differences live in the storage and parallelism layers.

Figure 2 traces the shared query path — parse, optimize, vectorized execution, columnar storage — and shows where the two engines diverge at the parallelism and storage layers.

DuckDB uses morsel-driven parallelism: a query is split into small work units (morsels) distributed across the cores of one machine, with intermediate results that can spill to disk when they exceed memory. This makes a single node punch well above its weight, and it means a laptop can run analytical queries over datasets larger than its RAM, within reason. The engine is tuned for one machine being used fully, not many machines being coordinated.
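The scheduling idea behind morsel-driven parallelism can be sketched in a few lines. This is a toy illustration, not DuckDB’s implementation: the function name, the tiny morsel size, and the use of Python threads are all assumptions made for readability, and spilling to disk is not modelled. Workers pull small chunks from a shared queue, aggregate into worker-local state, and the local results are merged once at the end.

```python
import queue
import threading
from collections import Counter

MORSEL_SIZE = 4  # rows per morsel; hypothetical, real engines use far larger units


def morsel_group_count(rows, num_workers=4, morsel_size=MORSEL_SIZE):
    """Count rows per key using workers that pull morsels from a shared queue."""
    morsels = queue.Queue()
    for start in range(0, len(rows), morsel_size):
        morsels.put(rows[start:start + morsel_size])

    partials = []  # one local aggregate per worker
    partials_lock = threading.Lock()

    def worker():
        local = Counter()
        while True:
            try:
                morsel = morsels.get_nowait()  # grab the next unit of work
            except queue.Empty:
                break  # no morsels left, this worker is done
            local.update(morsel)  # aggregate into worker-local state, no sharing
        with partials_lock:
            partials.append(local)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = Counter()
    for partial in partials:
        total.update(partial)  # merge step: add the per-worker counts together
    return dict(total)
```

Because workers pull work rather than being assigned fixed ranges, a slow morsel delays only the worker holding it. In CPython the threads here show the scheduling shape only; they do not deliver a real multi-core speedup.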

ClickHouse stores data in MergeTree-family parts: data is written in sorted, compressed parts that background processes merge over time, with sparse primary indexes that let scans skip large ranges. Parallelism spans both cores and nodes. A query can fan out across shards, run partial aggregations locally, and combine results. This is the architecture that lets ClickHouse sustain heavy ingest and large scans at the same time — at the cost of the merge machinery and distributed coordination you now have to operate.
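A sparse primary index is easier to reason about with a small model. The sketch below is an illustration of the general technique, not ClickHouse’s on-disk format: the granule size, function names, and integer keys are hypothetical. It keeps only the first key of each fixed-size block of a sorted part and uses binary search to decide which blocks a range filter must read.

```python
import bisect

GRANULE = 4  # rows per granule; hypothetical small value for readability


def build_sparse_index(sorted_keys, granule=GRANULE):
    """Keep only the first key of every granule of an already-sorted part."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule)]


def granules_to_scan(index, lo, hi):
    """Return the granule numbers that may hold keys in the closed range [lo, hi].

    The answer is conservative: it can include a granule with no match,
    but it never skips a granule that contains one.
    """
    # The granule just before the first mark >= lo may still hold lo at its tail.
    first = max(bisect.bisect_left(index, lo) - 1, 0)
    # Granules whose first key is greater than hi cannot contain a match.
    last = bisect.bisect_right(index, hi)
    return range(first, last)


keys = list(range(0, 40, 2))      # one sorted part: 0, 2, 4, ..., 38
index = build_sparse_index(keys)  # one mark per granule instead of one per row
to_read = list(granules_to_scan(index, 10, 18))
```

The index holds one entry per granule rather than one per row, which is why it stays small enough to keep in memory while still letting a scan skip most of a large part.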

A practical consequence: DuckDB’s columnar engine is exceptional at reading external columnar formats. Querying Parquet directly, projecting and filtering before materializing, is a first-class path. For data-lake querying — running SQL over Parquet sitting in object storage, or over an Iceberg table — DuckDB often feels like the natural fit precisely because it carries no storage opinions of its own. ClickHouse can read those formats too, and increasingly treats Iceberg as a first-class external table, but its center of gravity remains its own MergeTree storage where it is fastest. If your table-format strategy is the deciding factor, our comparison of Apache Iceberg vs Paimon lakehouse table formats covers the storage side of this trade-off.
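A minimal sketch of that direct-over-Parquet path, using the DuckDB Python package. The column names (`tenant_id`, `event_date`, `amount`) and the helper names are hypothetical, and the import is deferred so the SQL builder can be inspected without the package installed. Reading from object storage rather than local disk would additionally need DuckDB’s `httpfs` extension and credentials, which are not shown here.

```python
def build_lake_query(parquet_glob: str) -> str:
    """Return SQL that aggregates directly over Parquet files, with no ingest step."""
    escaped = parquet_glob.replace("'", "''")  # escape quotes inside the SQL literal
    return (
        "SELECT tenant_id, count(*) AS events, sum(amount) AS total "
        f"FROM read_parquet('{escaped}') "
        "WHERE event_date >= CAST(? AS DATE) "
        "GROUP BY tenant_id ORDER BY total DESC"
    )


def run_in_process(parquet_glob: str, since: str):
    """Run the query inside the calling process; requires `pip install duckdb`."""
    import duckdb  # third-party package, imported lazily

    con = duckdb.connect()  # no path given: an in-memory database, nothing to operate
    try:
        return con.execute(build_lake_query(parquet_glob), [since]).fetchall()
    finally:
        con.close()
```

Calling `run_in_process("events/*.parquet", "2026-01-01")` would scan only the three referenced columns. Whether row groups are also skipped by the date filter depends on the statistics stored in the files.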

Concurrency and scale

This is where DuckDB vs ClickHouse stops being a close call and becomes a requirements question. The two engines have fundamentally different concurrency models, and choosing wrong here is the most expensive mistake in this space.

Figure 3 shows DuckDB’s single-node model with one writer at a time bounded by one machine, versus ClickHouse’s many concurrent clients fanning across horizontally scalable shards.

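DuckDB is, by design, single-node and single-writer. Only one process can hold a database file open for writing at a time; several processes can share the file only when all of them open it read-only. Inside the writing process, threads can read and write concurrently under DuckDB’s MVCC and optimistic concurrency control, but conflicting writes fail rather than queue, so in practice writes go through one coordinated path. The whole thing lives within one process and one machine’s resources. That is not a defect; it is the deal you accept in exchange for zero operational overhead. The scale ceiling is the largest single machine you are willing to provision, and the concurrency ceiling is what one process can multiplex. For a single analyst, a batch job, or a per-request embedded query that spins up, answers, and tears down, this is plenty.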

ClickHouse is built for the opposite regime: many concurrent clients, high sustained write throughput, and datasets that exceed any single box. You scale by adding shards and replicas. The ceiling moves out as you add hardware, with the usual distributed-systems caveats — rebalancing, replication lag, and merge pressure all become things you monitor.
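The reason adding shards scales reads is that many aggregations decompose into a local step and a merge step. The sketch below shows that scatter-gather shape for an average, under the assumption of three in-memory "shards" and hypothetical function names. It illustrates the general technique and is not ClickHouse code.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Mergeable state for an average: carry sum and count, never the average itself."""
    total: float = 0.0
    count: int = 0


def local_partial(shard_rows):
    """Runs on each shard: aggregate (key, value) rows into per-key partial states."""
    by_key = {}
    for key, value in shard_rows:
        state = by_key.setdefault(key, Partial())
        state.total += value
        state.count += 1
    return by_key


def merge_partials(partials):
    """Runs on the coordinating node: combine shard states, then finalize."""
    merged = {}
    for by_key in partials:
        for key, state in by_key.items():
            target = merged.setdefault(key, Partial())
            target.total += state.total
            target.count += state.count
    return {key: state.total / state.count for key, state in merged.items()}


shards = [
    [("a", 1.0), ("b", 4.0)],
    [("a", 3.0)],
    [("b", 6.0), ("a", 5.0)],
]
averages = merge_partials(local_partial(rows) for rows in shards)
```

Only the small partial states cross the network, not the rows. The price is the coordination the paragraph above lists: every shard has to answer before the merge can finish, so the slowest shard sets the query’s latency.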

The decisive question is therefore concurrency, not raw single-query speed. Both engines are fast on a cold benchmark. But if hundreds of tenants will hit the same analytics surface simultaneously, a single in-process DuckDB handle is the wrong shape regardless of how fast each query is. Conversely, if the access pattern is “one job, one big scan, then done,” standing up a ClickHouse cluster is paying a concurrency premium you will never spend. This same single-node-versus-distributed reasoning shows up in the time-series world too, which we explore in InfluxDB vs TimescaleDB vs ClickHouse for IoT time series.

A note on benchmarks: published numbers for both engines vary widely with hardware, schema, compression, and query shape. Treat any single throughput figure with suspicion. The architecture, not a benchmark row, should drive this decision — the architecture tells you how the engine behaves under your concurrency and data-size assumptions, which is what actually matters.

Weighted decision matrix

The matrix below scores each option against the decision drivers on a 1-5 scale, where higher is better for that driver as we weighted it for an embedded-analytics context. Weights reflect this specific decision context; reweight them for yours before reusing the totals. These are directional architectural judgments, not measured benchmarks.

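| Decision driver                                        | Weight | DuckDB (in-process) | ClickHouse (server) | chDB (embedded) |
|--------------------------------------------------------|--------|---------------------|---------------------|-----------------|
| Low deployment surface                                 | 0.20   | 5                   | 2                   | 5               |
| High concurrency                                       | 0.18   | 2                   | 5                   | 2               |
| Scale beyond one machine                               | 0.15   | 1                   | 5                   | 1               |
| Time-to-first-query / low cold start                   | 0.12   | 5                   | 3                   | 5               |
| Data-lake / Parquet querying                           | 0.12   | 5                   | 4                   | 4               |
| Operational + dollar cost (higher score = lower cost)  | 0.13   | 5                   | 2                   | 4               |
| Ecosystem & SQL dialect breadth                        | 0.10   | 4                   | 5                   | 4               |
| Weighted total                                         | 1.00   | 3.76                | 3.65                | 3.51            |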

Read the totals carefully. They are close on average, which is exactly the point: the averages hide the fact that the engines win on different axes. DuckDB tops the table here only because this weighting prizes low deployment surface and cost; flip the weights toward concurrency and scale and ClickHouse pulls ahead decisively. The matrix is a tool for making your weighting explicit, not an oracle. A team that must serve many simultaneous queries should set the concurrency weight high enough that the matrix returns the answer their requirements already imply.
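Since the totals are just a weighted sum, it is worth recomputing them from the per-driver scores whenever the weights change. The sketch below does that with the weights and scores from the matrix. The dictionary keys are shorthand names chosen here, and `SCALE_HEAVY` is a hypothetical alternative weighting for a team that prioritises concurrency and scale.

```python
WEIGHTS = {
    "low_deployment_surface": 0.20,
    "high_concurrency": 0.18,
    "scale_beyond_one_machine": 0.15,
    "time_to_first_query": 0.12,
    "lake_parquet_querying": 0.12,
    "operational_dollar_cost": 0.13,  # higher score means lower cost
    "ecosystem_dialect_breadth": 0.10,
}

# Scores per option, in the same driver order as WEIGHTS (1-5, higher is better).
SCORES = {
    "DuckDB": dict(zip(WEIGHTS, [5, 2, 1, 5, 5, 5, 4])),
    "ClickHouse": dict(zip(WEIGHTS, [2, 5, 5, 3, 4, 2, 5])),
    "chDB": dict(zip(WEIGHTS, [5, 2, 1, 5, 4, 4, 4])),
}

# Hypothetical reweighting toward concurrency and scale; not from the matrix.
SCALE_HEAVY = dict(zip(WEIGHTS, [0.10, 0.30, 0.25, 0.08, 0.09, 0.10, 0.08]))


def weighted_totals(weights, scores):
    """Return each option's weighted total, rounded to two decimals."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return {
        option: round(sum(weights[d] * by_driver[d] for d in weights), 2)
        for option, by_driver in scores.items()
    }


baseline = weighted_totals(WEIGHTS, SCORES)
scale_heavy = weighted_totals(SCALE_HEAVY, SCORES)
```

Comparing `baseline` with `scale_heavy` makes the sensitivity concrete: the scores never change, only the weights do, and the ranking follows the weights.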

Decision walkthrough: choosing by workload

The cleanest way to apply the above is to walk the decision by workload shape rather than by feature checklist.

Figure 4 is a decision tree that routes a workload to DuckDB, ClickHouse, or chDB based on concurrency, data size, in-process requirements, and dialect needs.

Start with the two questions that have hard answers. Do you have many concurrent users or a high sustained write rate? If yes, you want ClickHouse server: neither in-process option is built for that regime, and trying to force it leads to a queue of serialized work or a fleet of processes you end up coordinating by hand. Is the data larger than one machine within your planning horizon? If yes, again ClickHouse, because that is the wall single-node engines cannot climb.

If both answers are no, you are in embedded-analytics territory, and the question becomes how you want to embed. Does the analytics need to run inside the application process or a notebook? If yes, choose between DuckDB and chDB on dialect: if you specifically want ClickHouse SQL, its function library, or compatibility with an existing ClickHouse codebase, chDB gives you the engine in-process. Otherwise DuckDB’s ergonomics, extension ecosystem, and Parquet-first design usually make it the lower-friction choice.

If the work is not in-process but is ad-hoc querying over a data lake — Parquet files, an Iceberg table — DuckDB is again the natural fit: it attaches to the files, projects what it needs, and gets out of the way. Only when the lake query layer must serve many concurrent users does it make sense to put a ClickHouse server in front of the same data.

The pattern across the whole tree: ClickHouse wins whenever concurrency or scale is the binding constraint; the in-process options win whenever operational simplicity and locality are what you are optimizing for.

It helps to walk three concrete workloads through the tree. A nightly ETL job that reads a few hundred gigabytes of Parquet, aggregates it, and writes a summary table fits one machine, has no concurrency to speak of, and lives happily in DuckDB — provisioning a cluster for it would be pure overhead. A multi-tenant SaaS dashboard where every logged-in customer triggers aggregations against shared event data has both concurrency and a growing working set, so it routes straight to ClickHouse server; the in-process options would serialize or fragment under that load. A data-science team that wants ClickHouse-compatible SQL in their notebooks, against data that still fits a workstation, is the textbook chDB case: in-process for the ergonomics, ClickHouse dialect for portability to the production cluster their pipelines already feed. The tree is not academic — these three shapes cover the large majority of real embedded-analytics decisions, and they land in three different places.
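The tree is small enough to write down as a function, which also forces the ambiguous branches to be stated. The sketch below is one reading of it. The field names are hypothetical; treating either many concurrent users or a high sustained write rate as sufficient for the server follows the signals listed in the recommendations; and the final fallback to DuckDB for anything not covered is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    many_concurrent_users: bool
    high_sustained_writes: bool
    exceeds_one_machine: bool       # within the planning horizon
    runs_in_process: bool           # inside the application or a notebook
    wants_clickhouse_dialect: bool  # existing ClickHouse SQL or function library


def route(w: Workload) -> str:
    """Route a workload to an engine, hard constraints first."""
    if w.many_concurrent_users or w.high_sustained_writes:
        return "ClickHouse server"
    if w.exceeds_one_machine:
        return "ClickHouse server"
    if w.runs_in_process:
        return "chDB" if w.wants_clickhouse_dialect else "DuckDB"
    # Ad-hoc lake querying without concurrency lands here; so does anything
    # the tree does not name (assumed default).
    return "DuckDB"


nightly_etl = Workload(False, False, False, False, False)
saas_dashboard = Workload(True, False, True, False, False)
notebook_team = Workload(False, False, False, True, True)
```

Routing the three workloads described above through `route` returns DuckDB, ClickHouse server, and chDB respectively. The order of the checks matters: a dialect preference never overrides a concurrency or scale constraint.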

Consequences, trade-offs, and what goes wrong

Every option here carries consequences that surface months after the decision. Documenting them is the point of an ADR.

Choosing DuckDB and then growing into a concurrency wall. The most common failure mode. DuckDB ships in week one because there is nothing to operate. Six months later the product has hundreds of tenants hitting the analytics surface, and a single-writer, single-process engine becomes the bottleneck. The fix is usually a fleet of DuckDB processes (one per request or per tenant) behind a coordination layer — at which point you have hand-rolled a worse version of what a server gives you. Mitigation: be honest up front about whether concurrency will arrive, and if it will, treat DuckDB as a deliberate stopgap with a migration trigger written down.

Choosing ClickHouse and paying the operational tax forever. ClickHouse is a distributed stateful database. Even managed, it carries real cognitive and dollar cost: schema and partition design that actually matters, merge and mutation behavior to understand, replication and shard topology to monitor. If your workload never needed the concurrency or scale, that tax buys nothing. Mitigation: only reach for the server when a driver genuinely demands it, and consider chDB if you want the dialect without the server.

Treating the data lake as solved by either engine. Both can query Parquet and Iceberg, but neither makes table-format governance — schema evolution, snapshot management, compaction — disappear. The query engine is downstream of the table-format decision, and getting the format wrong is more expensive than getting the engine wrong.

Underestimating the dialect lock-in. ClickHouse SQL and DuckDB SQL overlap but are not interchangeable. Functions, type handling, and extension behavior differ. Code written against one does not port for free. chDB exists partly to soften this for teams already invested in ClickHouse semantics.

Ignoring the observability angle. If the embedded analytics is actually a telemetry or metrics store, the ingestion pipeline matters as much as the query engine. How data arrives — batching, buffering, backpressure — shapes which engine behaves well, a topic we cover in OpenTelemetry Collector architecture and pipelines.

Conflating “embedded” with “small.” Embedded analytics does not mean the data is small; it means the engine runs inside the host process. A DuckDB instance can chew through tens of gigabytes on a well-provisioned container, spilling to disk as needed. The constraint is concurrency and the single-machine ceiling, not dataset size per se. Teams sometimes rule out an in-process engine because the dataset “sounds big,” when the real disqualifier — if there is one — is the number of simultaneous queries. Get this distinction wrong and you over-provision a cluster for a workload one node could have served.

Letting the prototype become the production architecture by default. Because DuckDB and chDB ship in an afternoon, they have a way of becoming permanent without anyone deciding they should. That is fine when the workload genuinely stays within their envelope, and a liability when it does not. The antidote is the same discipline an ADR is meant to enforce: name the assumptions, name the trigger that invalidates them, and schedule a review against the trigger rather than against a crisis. The cheapest migration is the one you planned before you needed it.

Practical recommendations

The decision compresses into a few defensible defaults:

  • Default to DuckDB for embedded, single-user, and lake-query work. If the workload is a batch job, an analyst’s exploration, a per-request in-process query, or ad-hoc SQL over Parquet and Iceberg, start here. The zero-operations property is worth a lot, and you can run it on a laptop or a single container.

  • Default to ClickHouse server when concurrency or scale is the binding constraint. Many simultaneous users, high sustained ingest, or a working set that exceeds one machine — these are the signals. Do not wait until the DuckDB wall is hit to plan the migration; decide based on the trajectory, not the current load.

  • Reach for chDB when you want ClickHouse semantics in-process. Existing ClickHouse SQL, a need for its function library, or a desire to prototype in-process before standing up a server — chDB is the bridge. It also makes a credible “embedded mode” for teams whose production analytics already runs on ClickHouse.

  • Write the migration trigger into the ADR. Whichever way you go, record the condition that would reverse the decision — a concurrency threshold, a data-size threshold, a latency SLO. That single line saves the next architecture review.

  • Do not benchmark your way to the decision. Benchmark to validate the choice the architecture already points to, and to size hardware. The deployment model, not a throughput row, is the decision.

  • Keep the two-step framing. Resolve in-process versus server first, on concurrency and scale. Only then, if in-process won, choose DuckDB versus chDB on dialect and ecosystem. Collapsing both steps into a single “which is better” question is what produces the muddled comparisons that recur in this space.
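A migration trigger is most useful when it is concrete enough to check mechanically. The sketch below records the three kinds of threshold named above and reports which ones an observation has crossed. Every name and number in it is hypothetical; the real thresholds belong to the team writing the ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationTrigger:
    """Conditions that would reopen the decision (hypothetical fields)."""
    max_concurrent_queries: int
    max_working_set_gb: float
    p95_latency_slo_ms: float


@dataclass(frozen=True)
class Observed:
    """What monitoring currently reports for the analytics surface."""
    concurrent_queries: int
    working_set_gb: float
    p95_latency_ms: float


def breached(trigger: MigrationTrigger, observed: Observed) -> list[str]:
    """Return the names of the thresholds the observation has crossed."""
    reasons = []
    if observed.concurrent_queries > trigger.max_concurrent_queries:
        reasons.append("concurrent queries")
    if observed.working_set_gb > trigger.max_working_set_gb:
        reasons.append("working set size")
    if observed.p95_latency_ms > trigger.p95_latency_slo_ms:
        reasons.append("p95 latency")
    return reasons


# Example values only; pick thresholds from your own trajectory.
trigger = MigrationTrigger(max_concurrent_queries=50,
                           max_working_set_gb=500.0,
                           p95_latency_slo_ms=800.0)
```

An empty list means the assumptions behind the decision still hold. A non-empty list is the cue to schedule the review, ideally well before the breach turns into an incident.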

A final operational note worth recording: whichever engine you choose, the table-format and ingestion decisions around it tend to dominate long-run maintenance more than the engine itself. An engine swap is a contained migration; a poorly chosen storage format or a brittle ingestion pipeline is a recurring tax. Spend the architectural attention proportionally, and revisit this ADR when concurrency, data size, or the surrounding storage strategy crosses the triggers you wrote down.

FAQ

Is DuckDB faster than ClickHouse?
For a single analytical query on one machine, the two are often comparable, and DuckDB can win on small-to-medium datasets thanks to its in-process design and zero network hop. ClickHouse pulls ahead as concurrency, ingest rate, and data size grow beyond one machine. “Faster” is the wrong question; the right one is which engine’s deployment model fits your concurrency and scale requirements, because that determines real-world performance far more than any single benchmark.

Can DuckDB replace ClickHouse?
Only for workloads that fit one machine and modest concurrency — embedded analytics, lake querying, and single-user exploration. DuckDB cannot replace ClickHouse for high-concurrency, high-ingest, multi-node OLAP, because it is single-node and single-writer by design. Many teams use both: DuckDB at the edges and in development, ClickHouse as the central high-concurrency serving layer.

What is chDB and when should I use it?
chDB is the ClickHouse query engine packaged as an in-process library, giving you ClickHouse SQL and functions without running a server. Use it when you want ClickHouse semantics inside an application or notebook, when prototyping before committing to a server, or when an existing ClickHouse codebase needs an embedded execution mode. It occupies the same in-process niche as DuckDB but with ClickHouse’s dialect.

Can DuckDB and ClickHouse query Iceberg and Parquet directly?
Yes, both can read Parquet from local disk and object storage, and both support reading Iceberg tables to varying degrees, with that support maturing quickly in 2026. DuckDB treats external columnar files as a first-class query target with no storage opinions of its own. ClickHouse can read them but is fastest over its native MergeTree storage, so external-table querying is a capability rather than its core strength.

Do I need a server to run ClickHouse?
Not necessarily. The standard ClickHouse deployment is a server or cluster, but chDB lets you run the ClickHouse engine in-process with no server at all. If your only objection to ClickHouse is operating a server, chDB removes that objection while keeping the dialect. You give up the distributed concurrency and scale that the server provides.

How do I decide between them for a customer-facing analytics feature?
Count the concurrency. A customer-facing feature usually means many tenants querying simultaneously, which points to ClickHouse server. If instead each tenant’s data is small and queries are isolated per request, a pool of in-process DuckDB or chDB instances can work and costs far less to operate. Write down the expected concurrent-query count — that number, more than anything else, decides it.
