Trino vs Presto vs Apache Spark: Query Engines for Lakehouse Analytics 2026
Choosing the right query engine is no longer a luxury; it's a competitive advantage. Teams managing petabyte-scale data lakes face three heavyweight contenders: Trino (the lean, interactive SQL powerhouse), Presto (the enterprise cousin backed by Meta), and Apache Spark (the data-processing juggernaut that doubles as a SQL engine). Each architecture optimizes for a different workload profile. Trino and Presto favor interactive queries, from sub-second to a few seconds, on lakehouse tables. Spark SQL dominates batch-processing pipelines, ML workflows, and heterogeneous data transformations. This post dissects their core architectures, benchmarks, and trade-offs, and lays out a clear decision matrix so you can pick the engine that fits your use case. We'll cover coordinator-worker topology, vectorized execution, Iceberg and Delta Lake integration, the Photon/Velox acceleration arms race, and the hidden costs that benchmarks rarely disclose.
Why distributed query engines matter for lakehouses in 2026
Lakehouse architectures (Iceberg, Delta Lake, Apache Hudi) unify storage and analytics by sitting on top of object storage (S3, GCS, ADLS). But raw Parquet files don't execute queries; you need a query engine. The 2026 landscape sees three engines fighting for dominance on latency, cost per query, catalog integration, and total cost of ownership. Trino leads in sub-second interactive OLAP and has become the de facto standard alongside catalogs like the Hive Metastore and AWS Glue. Presto offers enterprise features (fine-grained security, multiple backends, advanced connectors). Spark SQL commands the batch and ML pipeline space while adding interactive capability through Photon, Databricks' native vectorized execution engine. Teams need to understand the architectural differences and match them to workload characteristics.
Architecture: coordinator-worker vs driver-executor
These engines split along two patterns: a lightweight coordinator with fungible workers, versus a stateful driver with task-level fault tolerance. Trino and Presto use coordinator-worker topology; Spark uses driver-executor with DAG scheduling.

Trino and Presto run identical code on a single coordinator node and many interchangeable worker nodes. The coordinator receives a SQL query, parses it, optimizes the logical plan, breaks it into fragments, and distributes those fragments to workers. Workers execute in parallel; results stream back to the coordinator, where final aggregation happens. The coordinator keeps query state only in memory for the duration of a query; nothing persists between requests. This design scales horizontally because workers are completely fungible: add a node, start the Trino/Presto service, it registers with the coordinator's discovery service, and it immediately begins accepting work. If a worker fails, its fragments' partial work is lost; classic execution fails the query, while Trino's fault-tolerant execution mode can retry the affected tasks.
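The fan-out-and-aggregate loop can be sketched in a few lines. This is a toy model with invented names, using threads to stand in for worker nodes; it is not Trino code, just the shape of the topology:

```python
from concurrent.futures import ThreadPoolExecutor

def worker_execute(fragment):
    # Each "worker" scans its partition and returns a partial aggregate.
    return sum(fragment)

def coordinator_run(data, num_workers=4):
    # Fragment the input; any worker can take any fragment (workers are fungible).
    fragments = [data[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = pool.map(worker_execute, fragments)
    # Final aggregation happens on the coordinator.
    return sum(partials)

print(coordinator_run(range(100)))  # 4950
```

Scaling out is just adding another entry to the pool; no fragment is pinned to a particular worker.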
Spark follows a driver-executor model where a long-lived driver process (running in your client, a Kubernetes pod, or a Databricks notebook) orchestrates execution. The driver builds a DAG (directed acyclic graph) of transformations and tasks. It assigns tasks to executor JVMs on worker nodes. Executors run in parallel, shuffle data across the network, and report results back to the driver. The driver manages state (RDD lineage, task dependencies, broadcast variables) and is a single point of coordination. If the driver dies, the job fails. This model is designed for fault tolerance at the task level (Spark retries failed tasks) but not for driver failure.
This difference has profound implications. Trino and Presto are lighter-weight, easier to scale, and better for multi-tenant interactive queries. Spark is designed for long-running, stateful jobs where the driver controls the entire flow.
Pipelined vectorized execution in Trino
Trino processes data in pages (columnar batches, typically on the order of a megabyte) and pipelines them between operators. A filter feeding a join runs concurrently with it, not after it. This minimizes memory pressure and CPU cache misses. Presto shares this model but has fallen behind on optimization.
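Python generators give a compact analogy for page-at-a-time pipelining: each operator consumes and yields batches lazily, so the downstream operator starts before the upstream one has finished the table. Page sizes and operator names here are invented for the sketch:

```python
def scan(rows, page_size=4):
    # Emit the table one page (batch of rows) at a time.
    for i in range(0, len(rows), page_size):
        yield rows[i:i + page_size]

def filter_op(pages, predicate):
    # Consumes pages as they arrive; never materializes the full input.
    for page in pages:
        yield [r for r in page if predicate(r)]

def agg_op(pages):
    # Final aggregation pulls pages through the whole pipeline.
    total = 0
    for page in pages:
        total += sum(page)
    return total

rows = list(range(10))
result = agg_op(filter_op(scan(rows), lambda r: r % 2 == 0))
print(result)  # 20 (sum of even numbers 0..8)
```

The key property is that no operator waits for its predecessor to finish, which is why pipelined engines keep intermediate state small.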
DAG-based execution in Spark
Spark builds a directed acyclic graph of RDDs or DataFrames and breaks it into stages. Each stage runs a set of parallel tasks. Between stages, data is shuffled across the network to the executors that run the next stage. Spark optimizes plans via Catalyst (a rule- and cost-based optimizer) and Tungsten (whole-stage code generation and memory management). This model is powerful for complex multi-step pipelines but adds latency for simple queries, because the driver must submit dependent stages sequentially.
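The stage-cutting rule can be illustrated with a toy plan: narrow transformations are grouped into one stage, and a new stage begins at each wide (shuffle) dependency. The operator list and labels below are invented for the sketch, not Spark internals:

```python
# A linear logical plan, each operator tagged with its dependency type.
PLAN = [
    ("scan",    "narrow"),
    ("filter",  "narrow"),
    ("groupBy", "wide"),    # shuffle boundary: cut a stage here
    ("agg",     "narrow"),
    ("join",    "wide"),    # another shuffle boundary
    ("project", "narrow"),
]

def split_into_stages(plan):
    # Group consecutive operators; close the current stage at each wide dep.
    stages, current = [], []
    for op, dep in plan:
        current.append(op)
        if dep == "wide":
            stages.append(current)
            current = []
    if current:
        stages.append(current)
    return stages

print(split_into_stages(PLAN))
# [['scan', 'filter', 'groupBy'], ['agg', 'join'], ['project']]
```

Each resulting stage becomes a wave of tasks; the shuffles between them are where Spark pays its network and disk costs.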
Catalog integration: Iceberg, Delta, Hudi
All three engines now support Apache Iceberg as the preferred lakehouse format. Trino has first-class Iceberg support via its Iceberg connector and handles time-travel queries natively. Spark gained Iceberg support through the iceberg-spark-runtime package and can operate on Iceberg metadata directly. Presto follows but lags in maturity. Delta Lake originated at Databricks, so Spark has the fastest implementation; Trino and Presto added Delta support but with limitations around Z-ordering and other advanced optimizations. Hudi support is uneven: Trino's Hudi connector is newer and has fewer users, while Spark's Hudi integration is tighter but slower than its Iceberg path.
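For concreteness, here is a sketch of how a Trino Iceberg catalog is typically wired up: one properties file per catalog on each node. The catalog name, hostname, and metastore choice below are illustrative assumptions, not defaults:

```properties
# etc/catalog/lakehouse.properties -- illustrative values only
connector.name=iceberg
iceberg.catalog.type=hive_metastore
hive.metastore.uri=thrift://metastore.example.com:9083
```

With a catalog like this in place, Trino's time travel is plain SQL, along the lines of `SELECT * FROM lakehouse.sales.orders FOR TIMESTAMP AS OF TIMESTAMP '2026-01-01 00:00:00 UTC'` (schema and table names invented here).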
Execution models and query lifecycle
Understanding how each engine executes queries reveals the performance trade-offs.

Trino: streaming fragment execution
Trino parses the query, builds a logical plan, applies optimizer rules (predicate pushdown, projection pushdown, join reordering), and fragments the plan into distributed pieces. Each fragment runs on multiple workers, processing a partition of data. Results stream from workers to a local buffer on the coordinator. The coordinator spools results and sends them to the client in chunks. Query latency is driven by the slowest worker (straggler) and network overhead. Trino’s HTTP-based communication is well-tuned for interactive queries; it keeps connections open and batches results. Straggler mitigation comes from dynamic worker assignment: slow workers are deprioritized in favor of faster nodes.
Spark: stage-based scheduling
Spark builds a DAG of transformations and uses the Catalyst optimizer to rewrite the logical plan (filter pushdown, constant folding, column pruning). Then it breaks the DAG into stages, where each stage contains task-parallel operations that don’t require a shuffle. The driver submits stage 1, waits for completion, then submits stage 2. Each task runs on an executor, reads its input partition, applies the operation, writes intermediate results to disk (if it’s the input to a shuffle), and reports success. After all tasks in a stage complete, the shuffle layer moves data to the next stage. This approach is reliable but adds latency: a simple query that could be executed in one pipelined pass requires multiple stage submissions and shuffles.
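A toy latency model makes the overhead visible: each submitted stage pays a fixed driver-scheduling cost before its tasks run, while a pipelined engine pays once and streams between operators. The overhead constant is an assumption for illustration:

```python
SCHEDULE_OVERHEAD_S = 0.5   # assumed per-submission driver overhead, seconds

def staged_latency(stage_work_s):
    # Dependent stages are submitted one after another, each paying overhead.
    return sum(SCHEDULE_OVERHEAD_S + w for w in stage_work_s)

def pipelined_latency(stage_work_s):
    # One submission; operators overlap, so no per-stage re-queueing cost.
    return SCHEDULE_OVERHEAD_S + sum(stage_work_s)

print(staged_latency([1.0, 1.0, 1.0]))     # 4.5
print(pipelined_latency([1.0, 1.0, 1.0]))  # 3.5
```

For long batch jobs the fixed overhead vanishes into the noise; for a one-second interactive query it can dominate, which is the asymmetry the next section quantifies.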
Performance implications
For short interactive queries, Trino often returns results in 1–3 seconds, at which point network overhead dominates. Spark adds at least 2–4 seconds of driver scheduling and task launch overhead even with warm processes. For 100-hour batch jobs, Spark's stage-level fault tolerance and richer optimization strategies pay off; Trino's simpler fragment model can be less efficient on deeply nested or many-join queries.
Decision matrix: interactive vs batch vs hybrid
The choice between Trino, Presto, and Spark depends on workload characteristics.


Use Trino or Starburst Galaxy (Trino SaaS) when: You run 10–1,000 ad-hoc SQL queries per day on a shared lakehouse, expect sub-10-second latency, need to serve 50+ concurrent users, and want minimal operational overhead. Trino shines in data exploration, BI dashboards, and interactive analytics. Cost model: pay for compute hours; Starburst Galaxy charges per query plus data scanned.
Use PrestoDB when: You need Trino-class performance but also require fine-grained security (per-table and row-level access controls), side-by-side multi-backend connectivity (Postgres, MongoDB, Elasticsearch, Cassandra), or the backing of Meta and the Presto Foundation. Presto is slightly slower than Trino on pure OLAP but adds connectors for heterogeneous data. Cost: self-hosted (free) or a commercially supported distribution.
Use Spark SQL when: You build data pipelines, ETL jobs, ML feature engineering, or batch transformations. Spark SQL excels at joining a Parquet lake with a Postgres table, writing the result back to Iceberg, and triggering downstream ML training. Spark's ecosystem (MLlib, Spark Streaming, GraphX) is unmatched. Cost model: cluster resources (EC2, Kubernetes pods) billed per hour; Databricks adds a per-DBU platform charge whose rate depends on tier.
Hybrid approach: Use Trino for interactive analytics and BI, Spark for batch pipelines. Trino ingests fresh data from Iceberg tables populated by Spark jobs. This decoupling gives you the best of both worlds.
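The matrix above can be condensed into a small decision function. The 50% interactive-share threshold is an invented rule of thumb for this sketch, not vendor guidance:

```python
def pick_engine(interactive_share, needs_ml_or_etl):
    """interactive_share: fraction of the workload that is ad-hoc/BI queries."""
    if needs_ml_or_etl and interactive_share >= 0.5:
        # Significant interactive load AND pipelines: decouple the two.
        return "hybrid: Trino for BI, Spark for pipelines"
    if needs_ml_or_etl:
        return "spark"   # batch/ML-dominant shop
    return "trino"       # pure interactive analytics

print(pick_engine(0.9, False))  # trino
print(pick_engine(0.1, True))   # spark
print(pick_engine(0.6, True))   # hybrid: Trino for BI, Spark for pipelines
```

In practice you would feed this from real workload profiling (latency percentiles, concurrency) rather than a guessed share.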
Performance: benchmarks and caveats
Industry benchmarks (TPC-H, TPC-DS) show Trino executing 100-node queries 30–50% faster than Spark SQL, especially on OLAP workloads with large scans and aggregations. However, benchmarks are narrow: they don’t measure real-world costs (cluster elasticity, data freshness, concurrency), don’t test complex joins or window functions, and don’t account for startup time (Spark is faster on warm clusters).
TPC-H 10TB on 100-node cluster:
– Trino: 45–60 seconds (geometric mean across queries)
– Spark (3.x): 90–120 seconds
– Spark with Photon: 50–70 seconds
– Presto: 70–100 seconds
TPC-DS 100TB (subset):
– Trino: 200–300 seconds (99 queries)
– Spark (3.x): 400–500 seconds
– Spark with Photon: 250–350 seconds
Caveats:
1. Benchmarks run on warm clusters with data in memory caches. Cold-start (cluster spin-up, data fetch from S3) adds 10–30 seconds.
2. Trino’s results are returned to the client; Spark often writes results to disk (hidden latency).
3. Concurrency isn’t measured. Trino handles 100 queries simultaneously; Spark struggles with >10 concurrent jobs on the same cluster.
4. The benchmarks assume perfect data layout (Parquet, well-partitioned). Real-world data is messier.
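A note on the "geometric mean" figures quoted above: the geometric mean damps the effect of a single outlier query in a way the arithmetic mean does not, which is why TPC-style results use it. The times below are made up purely to show the calculation:

```python
import math

def geo_mean(times):
    # Geometric mean via logs: exp(mean(log(t))).
    return math.exp(sum(math.log(t) for t in times) / len(times))

times = [2.0, 4.0, 8.0]            # seconds per query (illustrative)
print(round(geo_mean(times), 2))   # 4.0; arithmetic mean would be ~4.67
```

When comparing published numbers, check which mean was used; mixing the two across engines makes gaps look larger or smaller than they are.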
Acceleration ecosystems: Photon, Velox, C++
2026 sees a move toward GPU and specialized CPU acceleration.
Databricks Photon (Spark 3.0+): A native C++ vectorized engine that replaces Spark's JVM execution for supported operators and claims 2–5x speedups on OLAP queries. Photon is available on Databricks premium tiers and works with Delta Lake and Iceberg. Cost: a higher per-DBU rate on Photon-enabled clusters, roughly doubling effective compute cost.
Velox (Meta/Presto): A vectorized C++ execution library powering Presto C++ (Prestissimo), still maturing as of April 2026. Velox promises Spark-like expressiveness with Trino-like speed. It's a bet that Presto can reclaim performance leadership.
Spark 4.0 ANSI SQL mode: Adds stricter SQL semantics, better optimizer rules, and native support for Iceberg column statistics. Spark 4.0 (expected mid-2026) will close some of the performance gap with Trino on standard OLAP workloads.
The trend is clear: pure JVM execution (classic Spark, classic Presto) is reaching its ceiling. The future belongs to compiled native engines (Photon, Velox, DuckDB-style) that trade startup time for sub-second query performance.
Trade-offs, gotchas, and what goes wrong
All three engines have blind spots.
Trino:
– No native support for complex nested transformations (e.g., explode + multiple aggregations on the same nested column).
– Weak on data mutation (INSERT/UPDATE/DELETE) compared to Spark.
– Single coordinator is a bottleneck for very large metadata (billions of partitions). Commercial distributions like Starburst offer mitigations, at added cost.
– Memory overhead: Trino buffers results on the coordinator before returning to the client, risking OOM on wide result sets.
– No built-in ML integration (unlike Spark).
PrestoDB:
– Slower than Trino on pure OLAP because of less aggressive optimization.
– Connector ecosystem is broader but less tested; Mongo and Cassandra connectors have lower SLAs than Iceberg.
– Limited Iceberg metadata statistics pushdown, leading to suboptimal join orders.
Spark SQL:
– Driver is a single point of failure; if it crashes, the job is lost. (Managed platforms such as Databricks mitigate this with automatic restarts and job retries.)
– Memory management is opaque; shuffles can blow up memory if data is skewed.
– Cold-start latency (2–5 seconds) makes Spark ill-suited for sub-second interactive queries.
– Tungsten code generation can fail silently on complex expressions; fallback is slow.
– Streaming (Structured Streaming) is mature but remains a second-class citizen compared to batch.

All three engines:
– Skewed data destroys performance. A join where one side has 1M rows on one partition and 1B rows on another will serialize.
– Metadata staleness (old table statistics) leads to terrible plans. Iceberg’s statistics help but require regular ANALYZE TABLE.
– Network is often the bottleneck, especially for cross-region queries. Locality-aware scheduling helps but isn’t perfect.
– Cold S3 buckets (no CloudFront caching) can add 100ms+ per request. Warming caches before queries is an underrated optimization.
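Skew is cheap to check before it wrecks a job: compare the largest partition to the mean. A ratio near 1 means even distribution; a large ratio predicts stragglers, since the slowest task gates the whole stage or fragment. The sizes below are illustrative:

```python
def skew_factor(partition_sizes):
    # Ratio of the biggest partition to the average partition.
    mean = sum(partition_sizes) / len(partition_sizes)
    return max(partition_sizes) / mean

even   = [100, 110, 95, 105]
skewed = [100, 100, 100, 10_000]
print(round(skew_factor(even), 2))    # 1.07 -> healthy
print(round(skew_factor(skewed), 2))  # 3.88 -> one task does most of the work
```

Both Spark (adaptive query execution) and Trino can mitigate moderate skew at runtime, but pathological key distributions still need a layout fix (salting, repartitioning) upstream.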
Practical recommendations
For your team:
- Profile your workload. Measure query latencies, concurrency, and data volumes. If 90% of queries finish in under 30 seconds and you have more than 50 concurrent users, Trino is the right choice. If you're building ETL pipelines that transform 10+ tables, Spark wins.
- Start with Trino for interactive analytics. It has the lowest operational overhead and the fastest ROI. Pair it with object storage (S3) and Iceberg tables. Use Starburst Galaxy if you want to avoid managing a cluster.
- Add Spark for batch and ML. Once you have Iceberg tables populated by Spark, Trino can query them interactively. This decoupling keeps resource usage predictable.
- Invest in monitoring. For Trino, track query latency percentiles and worker utilization. For Spark, track DAG stage times and shuffle metrics. For all engines, monitor metadata freshness (Iceberg statistics staleness).
- Plan for specialized hardware. By Q4 2026, Photon-accelerated Spark and Velox-based Presto C++ will be mature. If you're building a new data stack, assume you'll want accelerated nodes within 12 months.
Checklist:
– [ ] Classify your queries: interactive (<10s), batch (10s–10m), long-running (>10m).
– [ ] Measure concurrency: How many queries run simultaneously? Does your BI tool queue queries?
– [ ] Assess data freshness requirements: Real-time (refreshed every minute)? Hourly? Daily?
– [ ] Evaluate catalog maturity: Do you have Iceberg partitions and statistics, or raw Parquet?
– [ ] Plan cloud costs: Trino (interactive) is cheaper for ad-hoc; Spark is cheaper for pre-planned, high-volume batch.
Frequently asked questions
Can Trino replace Spark completely?
No. Trino excels at querying existing tables but lacks Spark's machinery for complex multi-step transformations (e.g., explode-heavy logic, nested aggregations, ML feature engineering). Trino can write to Iceberg, but Spark remains the more battle-tested engine for large, multi-step write pipelines. Use Trino for reads and analytics; use Spark for writes and transformations.
What’s the difference between Presto and Trino?
Presto is now the Meta-maintained version (open-source). Trino is the fork maintained by the original creators and is 6–12 months ahead on features. Trino is faster on OLAP queries. Presto is better if you need connectors to non-SQL systems (Mongo, Elasticsearch). For new projects, choose Trino.
Does Iceberg matter for query engine choice?
Yes. Iceberg’s time-travel, schema evolution, and atomic writes reduce the need for complex ETL logic in your query engine. All three engines support Iceberg, but Trino has the tightest integration. If you’re standardizing on Iceberg, Trino is the natural choice for interactive analytics.
How much does it cost to run Trino vs Spark?
Trino: $0.50–$2.00 per query on Starburst Galaxy (depends on data scanned). Spark on Databricks: $0.50–$5.00 per job (depends on cluster size and duration). For 100 daily ad-hoc queries, Trino costs $50–$200/day. For 100 daily batch jobs, Spark costs $50–$500/day depending on job complexity. The break-even is around 50–100 daily interactive queries; beyond that, Trino is cheaper.
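One way to sanity-check a break-even like this is to pit per-query SaaS pricing against a flat-rate warm cluster. Every number below is an assumption for illustration, not a vendor quote, and the crossover moves with your actual rates:

```python
PER_QUERY_COST   = 1.25   # $/query, midpoint of the $0.50-$2.00 range above
CLUSTER_PER_HOUR = 8.00   # assumed hourly cost of an always-warm cluster
HOURS_PER_DAY    = 12     # hours/day the cluster stays up

def per_query_daily(n_queries):
    return n_queries * PER_QUERY_COST

def warm_cluster_daily():
    # Flat cost regardless of how many queries actually run.
    return CLUSTER_PER_HOUR * HOURS_PER_DAY

print(per_query_daily(50), warm_cluster_daily())  # 62.5 96.0
# Under these assumptions, per-query pricing wins below ~77 queries/day
# (96 / 1.25), and the flat-rate cluster wins above it.
```

The useful habit is the comparison itself: model your own rates and query volume before trusting anyone's break-even figure.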
Can I use Spark SQL for interactive queries?
Yes, but it's not optimized for that use case. Spark SQL queries carry several seconds of driver scheduling overhead before work starts and are heavy on memory. For BI dashboards requiring <5-second latency, use Trino. For exploratory queries in notebooks, Spark is fine because the startup cost is amortized over the session.
Further reading
- Pillar: Cloud DevOps Architecture
- Sibling: Apache Iceberg: Data Lakehouse Production Deep Dive
- Sibling: Apache Kafka Tiered Storage (KIP-405): Architecture and Use Cases
- Sibling: Time-Series Database Internals: InfluxDB, TimescaleDB, QuestDB Compared 2026
- Sibling: PostgreSQL vs YugabyteDB vs CockroachDB: Distributed SQL Compared
- Sibling: Vector Database Benchmarks 2026: Pinecone, Weaviate, Qdrant, Milvus
References
- Trino Documentation — Trino Query Engine — comprehensive docs on query execution, connectors, and Iceberg integration
- Apache Spark Documentation — Spark SQL, DataFrames and Datasets Guide — official reference on Catalyst optimizer, Tungsten, and Photon
- PrestoDB Documentation — Presto: Distributed SQL Query Engine — Meta’s maintained fork with connector details
- Databricks Photon — Databricks Documentation — performance claims and native vectorized engine architecture
- Meta Velox: A Unified Execution Engine — Velox GitHub Repository — next-generation vectorized compute engine powering Presto C++
- Apache Iceberg Documentation — Iceberg Table Format — schema evolution, time travel, and ACID semantics
Last updated: April 22, 2026. Author: Riju.
