Zenoh vs MQTT vs DDS: Robotics Middleware Compared (2026)

Pick the wrong pub/sub layer and you discover it the hard way: a robot fleet that floods the network with discovery traffic the moment you scale past a dozen nodes, or a cloud telemetry pipeline that chokes the first time a forklift drives behind a steel rack and the Wi-Fi link goes lossy. The Zenoh vs MQTT vs DDS decision is not a feature checklist; it is a bet on a connectivity model, a discovery strategy, and a QoS philosophy that you will live with for the life of the platform. These three protocols solve overlapping problems from genuinely different starting points: MQTT centralizes on a broker, DDS distributes into a peer mesh, and Zenoh blends peer, brokered, and routed modes into one fabric. That difference shows up everywhere: in tail latency, in how badly things degrade over a WAN, and in whether ROS 2 stays responsive at fleet scale.

What this covers: connectivity and discovery models, transports, a side-by-side decision matrix, latency and scaling trade-offs, security, and concrete guidance on which protocol to choose for robotics and edge IoT.

Context and Background

The three protocols come from different decades and different design committees, and that lineage explains most of their behavior. MQTT is the oldest of the trio. It was created in 1999 for telemetry over expensive, low-bandwidth satellite links, and it has always been a broker-centric publish/subscribe protocol: clients connect to a central broker, publish to topic strings, and subscribe to topic filters. The current standard, OASIS MQTT 5.0, added shared subscriptions, message expiry, and richer reason codes, but the model stays the same: everything routes through the broker. That simplicity is exactly why MQTT dominates cloud-facing IoT telemetry.

DDS (Data Distribution Service) comes from the Object Management Group’s DDS specification and is data-centric rather than message-centric. Instead of opaque payloads on topic strings, DDS models typed data objects with keys, and its wire protocol RTPS lets participants discover each other peer-to-peer with no broker at all. It carries a famously rich QoS catalog — deadlines, liveliness, durability, ownership — built for real-time and mission-critical systems in defense, aerospace, and industrial automation. DDS became the default middleware for ROS 2 precisely because robots need that determinism.

Zenoh (Zero Network Overhead) is the newcomer, an Eclipse Foundation project started by engineers with long DDS backgrounds. It unifies three primitives (publish/subscribe, distributed query, and storage) over one protocol that runs in peer, brokered (router), or routed mesh topologies. The Eclipse Zenoh documentation frames it as a fabric that scales from microcontrollers to the data center. Crucially, ROS 2 now ships an rmw_zenoh middleware layer, making it a first-class alternative to DDS under the same ROS API. Understanding the OPC UA vs MQTT Sparkplug B trade-offs gives useful context for how data-modeling choices ripple through an industrial stack.

It helps to be precise about the abstraction each protocol exposes, because that shapes everything you build on top. MQTT gives you a flat namespace of topic strings and opaque byte payloads; the protocol neither knows nor cares what is inside a message, which keeps it simple but pushes all schema and type discipline up into your application. DDS hands you typed, keyed data objects and the concept of a global data space — readers and writers do not exchange messages so much as share a continuously updated view of named data instances, with the middleware tracking the latest value per key. Zenoh’s primitive is the key expression, a hierarchical, wildcard-capable resource name that a peer can publish to, subscribe to, or query, and that a storage can persist. That key-expression model is what lets Zenoh fold pub/sub, query, and storage into one API instead of three. For a robotics architect, the practical question is whether you want the middleware to enforce types and track state (DDS), stay out of the way (MQTT), or give you a queryable named-resource fabric (Zenoh) — and that choice is far more consequential than any single latency figure.
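To make the key-expression idea concrete, here is a minimal in-memory sketch of a named-resource fabric. It is not the Zenoh API: `keyexpr_matches` and `ToyFabric` are hypothetical names, and the matcher is a simplification that assumes `*` stands for exactly one chunk and `**` for zero or more chunks. It only illustrates how publish, subscribe, and query can all hang off one naming scheme.

```python
from typing import Callable, Dict, List, Tuple


def keyexpr_matches(expr: str, key: str) -> bool:
    """Simplified matcher: '*' is exactly one chunk, '**' is zero or more chunks."""
    def match(e: List[str], k: List[str]) -> bool:
        if not e:
            return not k
        if e[0] == "**":
            # '**' either absorbs nothing, or absorbs one chunk and tries again.
            return match(e[1:], k) or (bool(k) and match(e, k[1:]))
        if not k:
            return False
        if e[0] == "*" or e[0] == k[0]:
            return match(e[1:], k[1:])
        return False

    return match(expr.split("/"), key.split("/"))


class ToyFabric:
    """In-memory stand-in for a named-resource fabric (hypothetical, not Zenoh)."""

    def __init__(self) -> None:
        self.storage: Dict[str, bytes] = {}
        self.subscribers: List[Tuple[str, Callable[[str, bytes], None]]] = []

    def subscribe(self, expr: str, callback: Callable[[str, bytes], None]) -> None:
        self.subscribers.append((expr, callback))

    def put(self, key: str, value: bytes) -> None:
        self.storage[key] = value  # the storage keeps the latest value per key
        for expr, callback in self.subscribers:
            if keyexpr_matches(expr, key):
                callback(key, value)

    def get(self, expr: str) -> Dict[str, bytes]:
        # A query is the same match, applied to stored values instead of live ones.
        return {k: v for k, v in self.storage.items() if keyexpr_matches(expr, k)}
```

A subscriber on `fleet/*/pose` would see a put to `fleet/r1/pose` but not one to `fleet/r1/battery`, while a later `get("fleet/**")` would return both stored values. The same matching function serves the live path and the query path, which is the point of the single-API design.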

Architecture and Data Flow Compared

Figure 1: Three connectivity topologies — MQTT routes every message through a central broker, DDS forms a brokerless peer mesh, and Zenoh combines peer links with optional routers for WAN reach.

The single most important architectural fact is connectivity model. MQTT is brokered: no broker, no messaging. DDS is brokerless and peer-to-peer: participants talk directly. Zenoh is hybrid: peers can talk directly on a LAN, and routers stitch peers together across subnets and WANs. Everything else — discovery cost, failure modes, WAN behavior — flows from this one choice.

Connectivity: brokered vs brokerless vs routed

In MQTT, every publisher sends to the broker and every subscriber receives from the broker. This star topology is trivial to reason about and trivial to secure: you defend one box, you authenticate against one box, and a packet capture at the broker shows you the entire system. The cost is an extra network hop on every message and a hard dependency on a single component. A publisher in the same room as its subscriber still routes through a broker that might be in another data center.

DDS removes the broker entirely. Participants form a peer mesh where each writer sends samples directly to each matched reader. On a low-latency LAN this is ideal: there is no intermediary, and a sample travels one hop from producer to consumer. The price is that every participant must know about every other relevant participant, which is where discovery cost explodes at scale, and that there is no natural choke point for security or observability — you secure N participants, not one broker.

Zenoh deliberately refuses to pick a side. On a LAN, two Zenoh peers discover each other and exchange data directly, behaving like DDS. When you need to cross subnets, traverse NAT, or reach a cloud, you insert a Zenoh router, and peers connect through it like an MQTT broker — except the router is optional and you can run many of them in a mesh. This is the key reason Zenoh suits robot fleets that live partly on-robot, partly at an edge gateway, and partly in the cloud: one protocol spans all three tiers without a protocol bridge.

This hybrid stance has a concrete operational payoff that is easy to underestimate until you run a fleet. With a pure peer model, adding the cloud means either exposing every robot’s peer to the internet (a security and NAT nightmare) or bolting on a protocol bridge that translates DDS to something WAN-friendly — and that bridge becomes a brittle, stateful component you now have to operate. With a pure broker model, two robots in the same cell that need a tight loop are forced to round-trip through a broker that may be far away. Zenoh lets each link pick the topology that fits: direct peer where latency matters, routed where reach matters, and the routing layer reconfigures as robots roam between cells without the application code changing. The cost is conceptual — engineers must understand when a router is in the path and when it is not — but it removes the bridge component that so often becomes the weakest link in a multi-tier robotics deployment.
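The hop argument can be put into rough numbers. The sketch below is a deliberately crude model, not a benchmark: `one_way_latency_ms` is a hypothetical helper, it ignores processing time inside the broker or router, and it assumes a brokerless peer mesh has no usable path off the LAN without a bridge.

```python
from typing import Optional


def one_way_latency_ms(model: str, direct_ms: float, to_hub_ms: float,
                       same_lan: bool) -> Optional[float]:
    """Estimate publisher-to-subscriber latency for one message.

    direct_ms: one-way latency of the direct link between the two endpoints.
    to_hub_ms: one-way latency from either endpoint to the broker or router.
    Returns None where this model assumes no path exists without a bridge.
    """
    if model == "mqtt":
        # Always publisher -> broker -> subscriber, even inside one room.
        return 2 * to_hub_ms
    if model == "dds":
        # One hop on the LAN; off-LAN reach is assumed to need a bridge.
        return direct_ms if same_lan else None
    if model == "zenoh":
        # Direct peer link on the LAN, routed path otherwise.
        return direct_ms if same_lan else 2 * to_hub_ms
    raise ValueError(f"unknown model: {model}")
```

With illustrative values of 0.5 ms for the direct link and 20 ms to a remote hub, two robots in the same cell would see 40 ms through the broker and 0.5 ms through either peer model, while a cross-site link would fall back to the 40 ms routed path under the hybrid model. The figures are placeholders; only the shape of the comparison is the point.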

Discovery: where the protocols diverge most

MQTT has no peer discovery at all. Clients are configured with the broker’s address, and topic subscriptions are matched at the broker. This is a feature, not a gap: zero discovery traffic means MQTT scales to enormous client counts without any participants needing to know about each other. The trade is that the broker becomes the system’s memory and matching engine, so its capacity, not the network’s, sets the scaling ceiling. A managed broker cluster can hold hundreds of thousands of connections, which is exactly why MQTT underpins consumer IoT at internet scale and why it carries no discovery storm regardless of client count.
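Broker-side matching is simple enough to sketch in a few lines. The function below follows the MQTT topic-filter rules (`+` matches one level, `#` matches the remaining levels including the parent, and topics beginning with `$` are not matched by a filter that starts with a wildcard). `ToyBroker` is a hypothetical in-memory stand-in and does not reflect how any particular broker stores subscriptions.

```python
from collections import defaultdict
from typing import Dict, List, Set


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a topic filter."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    # Topics such as $SYS/... are hidden from filters that start with a wildcard.
    if topic.startswith("$") and f_levels[0] in ("+", "#"):
        return False
    for i, level in enumerate(f_levels):
        if level == "#":
            return True  # matches every remaining level, including none
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


class ToyBroker:
    """Central subscription table: clients never learn about each other."""

    def __init__(self) -> None:
        self.subscriptions: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, client_id: str, topic_filter: str) -> None:
        self.subscriptions[topic_filter].add(client_id)

    def route(self, topic: str) -> List[str]:
        matched: Set[str] = set()
        for topic_filter, clients in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                matched |= clients
        return sorted(matched)
```

All matching state lives in one table on the broker, so a new client costs one table entry rather than a round of announcements to every other client.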

DDS pays for its brokerless freedom with discovery overhead. The Simple Participant Discovery Protocol (SPDP) announces each participant, typically over multicast, and then the Simple Endpoint Discovery Protocol (SEDP) exchanges the full list of readers and writers so participants can match compatible endpoints by topic and QoS. The traffic and memory cost of this handshake grows roughly with the square of the participant count, because every participant must learn about every other one: N participants form N(N-1)/2 pairs, so doubling the fleet roughly quadruples the discovery work. This DDS discovery overhead is one of the most common reasons large ROS 2 fleets stall, because the discovery storm can saturate links before any application data flows. Vendors mitigate it with discovery servers and static discovery, but the underlying N-to-N matching remains.

Zenoh takes a middle path called scouting plus gossip. Peers scout for each other (often via multicast on a LAN), and routers gossip subscription and query state so that the network only propagates the routing information that is actually needed. The result is discovery that scales closer to the number of distinct resources than to the square of the participant count, which is why Zenoh tends to stay calm where DDS gets loud as node counts climb. The mechanism matters: because routers aggregate and summarize subscription interest, a peer joining a large network does not have to learn every other peer’s endpoints — it learns the routing toward the resources it actually cares about. That aggregation is the structural difference from DDS, where the absence of any aggregation point forces full N-to-N endpoint knowledge. In practice this is the difference between a discovery cost you can bound by your number of distinct topics and one that grows with the square of how many processes you run.
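The effect of aggregation can be illustrated with a toy count. This is not Zenoh's routing algorithm: both functions are hypothetical, the model treats identical key expressions as the only thing a router can merge, and it compares that against a fleet where every peer stores every other peer's subscriptions.

```python
from typing import Dict, List, Set


def aggregated_interest(peer_subscriptions: Dict[str, List[str]]) -> Set[str]:
    """Distinct key expressions an aggregating router would advertise upstream."""
    distinct: Set[str] = set()
    for expressions in peer_subscriptions.values():
        distinct.update(expressions)
    return distinct


def full_mesh_entries(peer_subscriptions: Dict[str, List[str]]) -> int:
    """Entries stored fleet-wide if each subscription is learned by all other peers."""
    peers = len(peer_subscriptions)
    total_subscriptions = sum(len(e) for e in peer_subscriptions.values())
    return total_subscriptions * (peers - 1)


# Hypothetical fleet: 100 robots, each subscribing to the same two resources.
fleet = {f"robot{i}": ["fleet/cmd/**", "fleet/map"] for i in range(100)}
```

For this fleet the aggregated view holds 2 entries, while the full-mesh view holds 200 subscriptions times 99 other peers, or 19,800 entries. Real deployments sit somewhere between the two extremes, since not every peer subscribes to the same resources.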

Transports: TCP, UDP, shared memory, and QUIC

MQTT runs over TCP, with MQTT-SN and other variants for UDP-style and constrained links, plus WebSockets for browser reach. The reliability comes from TCP plus MQTT’s own QoS 0/1/2 acknowledgment levels.

DDS over RTPS is UDP-first, using multicast for discovery and for fan-out to many readers, which is efficient on a wired LAN but problematic on Wi-Fi, where multicast is unreliable and slow. High-end DDS implementations add a shared-memory transport so two processes on the same machine skip the network stack entirely, a big win for on-robot inter-process communication of camera and lidar frames. That shared-memory path is genuinely important for robotics: a depth camera producing hundreds of megabytes per second to a perception node on the same compute module cannot afford to serialize and copy through the kernel network stack, so zero-copy shared memory is the difference between a viable and an unviable on-robot pipeline. Both DDS and Zenoh offer a shared-memory transport; MQTT, being broker-mediated, does not, which is one more reason it lives at the telemetry layer rather than the sensor-processing layer.
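The idea behind a shared-memory transport can be shown with Python's standard `multiprocessing.shared_memory` module. This is only an illustration of the mechanism, not the transport of any DDS or Zenoh implementation: the frame size is a hypothetical 640x480 16-bit depth image, the helper names are made up, and the writer and reader run in one process here for brevity where a real pipeline would attach from a second process by name.

```python
from multiprocessing import shared_memory

FRAME_BYTES = 640 * 480 * 2  # hypothetical 16-bit depth frame


def publish_frame(frame: bytes) -> shared_memory.SharedMemory:
    """Place a frame in a named shared-memory segment and return its handle."""
    shm = shared_memory.SharedMemory(create=True, size=len(frame))
    shm.buf[:len(frame)] = frame
    return shm


def peek_frame(name: str, count: int) -> bytes:
    """Attach to an existing segment by name and read its first `count` bytes."""
    reader = shared_memory.SharedMemory(name=name)
    try:
        # Only `count` bytes are copied out; the frame itself stays in place.
        return bytes(reader.buf[:count])
    finally:
        reader.close()


def demo() -> bytes:
    frame = bytes(range(256)) * (FRAME_BYTES // 256)
    shm = publish_frame(frame)
    try:
        return peek_frame(shm.name, 4)
    finally:
        shm.close()
        shm.unlink()  # release the segment once every reader is done
```

The reader never receives the frame over a socket; it is handed a name and maps the same memory. What travels between processes is a small reference, which is why this path holds up at camera and lidar data rates.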

Zenoh supports TCP, UDP, shared memory for intra-host zero-copy, and QUIC for efficient, encrypted, multiplexed transport over WANs and lossy links. That QUIC path is a deliberate answer to the exact scenario where DDS multicast falls apart: a mobile robot on flaky wireless that needs to reach a cloud router. The transport flexibility is a large part of why Zenoh markets itself for edge and constrained deployments, and it connects naturally to a broader unified namespace architecture for industrial IoT where one fabric must span shop floor and cloud.

Latency, QoS, and Scale: Where Each Wins

DDS discovery is expensive because every participant must learn every other participant’s endpoints; the sequence below shows the SPDP-then-SEDP handshake that runs before a single byte of user data moves.

Figure 2: The DDS discovery handshake — SPDP announces participants over multicast, then SEDP exchanges and matches reader/writer endpoints before user data can flow.

The handshake in Figure 2 is cheap for two participants and crippling for two hundred. Because SEDP exchanges the full endpoint set, the matching work scales with the product of writers and readers across the system. In a single robot with a few dozen nodes this is invisible. In a warehouse with fifty robots each running fifty nodes, the discovery database on every participant must track thousands of endpoints, and any topology change triggers a fresh wave of matching traffic. This is the scaling wall that pushes large ROS 2 deployments toward discovery servers — or toward Zenoh.
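The warehouse example is easy to work through numerically. The helper below is a back-of-envelope sketch under stated assumptions: each node is treated as its own DDS participant, all participants share one domain, and the figure of four endpoints per node is a hypothetical value chosen only to make the arithmetic concrete.

```python
from typing import Dict


def discovery_load(robots: int, nodes_per_robot: int,
                   endpoints_per_node: int) -> Dict[str, int]:
    """Rough size of the discovery problem for a fleet on one shared domain."""
    participants = robots * nodes_per_robot
    endpoints = participants * endpoints_per_node
    return {
        "participants": participants,
        # Every unordered pair of participants must discover each other.
        "participant_pairs": participants * (participants - 1) // 2,
        # Endpoints a single participant must track, excluding its own.
        "remote_endpoints_per_participant": endpoints - endpoints_per_node,
    }


warehouse = discovery_load(robots=50, nodes_per_robot=50, endpoints_per_node=4)
```

Under these assumptions the fleet has 2,500 participants, 3,123,750 participant pairs, and 9,996 remote endpoints for each participant to track. A single robot with the same settings has 50 participants and 1,225 pairs, which is why the problem stays invisible on the bench.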

Figure 3: Zenoh routed data flow — edge peers connect to an edge router, which forwards over QUIC to a cloud router that serves live subscribers and answers historical queries from storage.

Figure 3 shows what Zenoh adds beyond pub/sub: the same fabric carries live data and answers historical queries against a storage backend, so an edge robot, a cloud dashboard, and a time-series store all speak one protocol. That unification is the value proposition of rmw_zenoh for distributed fleets: the on-robot graph and the cloud graph are no longer separated by a protocol bridge.

Here is the decision matrix that captures the structural differences. Treat any latency characterization as illustrative and order-of-magnitude; real numbers depend heavily on payload size, transport, hardware, and QoS settings, so benchmark your own workload before committing.

| Dimension | Zenoh | MQTT | DDS |
|---|---|---|---|
| Connectivity | Peer plus optional routers (hybrid) | Brokered star (centralized) | Brokerless peer mesh |
| Discovery | Scouting plus gossip, scales by resource | None: broker matches topics | SPDP/SEDP, scales by participant squared |
| QoS | Reliability, congestion control, priorities | QoS 0/1/2 delivery levels | Rich catalog: deadline, liveliness, durability, ownership |
| Transport | TCP, UDP, shared memory, QUIC | TCP (MQTT-SN/UDP variants), WebSocket | UDP/RTPS, multicast, shared memory |
| WAN/edge fit | Strong: routers plus QUIC built for it | Good: broker bridges to cloud cleanly | Weak: multicast and discovery struggle over WAN |
| ROS 2 support | rmw_zenoh, first-class and growing | Via bridges only, not native RMW | Default RMW, mature and battle-tested |
| Footprint | Tiny: runs on microcontrollers | Small client, broker required | Heavier: full DDS stack per participant |
| Security | Access tokens, TLS, link auth | TLS plus username/ACL | DDS-Security plugins: auth, access control, crypto |

A few patterns fall out of the matrix. For raw on-LAN latency between two matched endpoints, DDS and Zenoh are both excellent because both can deliver a sample in a single hop with no broker in the path; MQTT pays an extra hop through the broker. For fan-out to thousands of cloud subscribers, MQTT’s broker model is the natural fit and the one with the most mature managed offerings. For WAN and constrained links — the mobile-robot-on-bad-Wi-Fi case — Zenoh’s routed mode and QUIC transport win clearly, because DDS multicast discovery does not survive crossing subnets and lossy radio gracefully.
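The three patterns can be written down as a small selection helper. This is a sketch of the reasoning in the paragraph above, not a complete selection procedure: the function name, the order in which the checks run, and the threshold of 1,000 subscribers standing in for "thousands" are all assumptions.

```python
from typing import List


def suggest_protocol(*, crosses_wan_or_lossy_link: bool, cloud_subscribers: int,
                     needs_single_hop_on_lan: bool) -> List[str]:
    """Map the three patterns to candidate protocols (check order is an assumption)."""
    if crosses_wan_or_lossy_link:
        return ["Zenoh"]  # routed mode plus QUIC for WAN and constrained links
    if cloud_subscribers >= 1000:  # hypothetical threshold for large fan-out
        return ["MQTT"]  # broker fan-out with mature managed offerings
    if needs_single_hop_on_lan:
        return ["DDS", "Zenoh"]  # both can deliver without a broker in the path
    return ["DDS", "MQTT", "Zenoh"]  # none of the three patterns applies
```

A mobile robot on flaky Wi-Fi would come back with Zenoh, a telemetry feed for several thousand dashboards with MQTT, and a tight on-robot loop with either DDS or Zenoh. When several conditions hold at once the check order decides, so treat the result as a starting shortlist rather than a verdict.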

QoS is where DDS still leads on expressiveness. Its catalog lets you declare a deadline (samples must arrive every N milliseconds or the reader is notified), liveliness (a writer is considered alive only if it asserts itself on schedule), durability (late-joining readers receive historical samples), and ownership (only samples from the highest-strength writer of a key are delivered). For safety-critical control loops that contract is genuinely valuable. MQTT offers three blunt delivery guarantees (at most once, at least once, exactly once), and Zenoh sits in between with reliability, congestion control, and priority handling, adding more QoS surface as it matures but not yet matching the full DDS catalog.
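Two of these policies are easy to model in isolation. The sketch below shows the semantics only and is not a DDS API: `missed_deadlines` and `exclusive_owner` are hypothetical helpers, timestamps are plain numbers in milliseconds, and breaking strength ties by writer name is an assumption standing in for whatever deterministic rule an implementation applies.

```python
from typing import Dict, List, Optional, Set, Tuple


def missed_deadlines(arrival_times_ms: List[float],
                     deadline_ms: float) -> List[Tuple[float, float]]:
    """Return (previous, current) arrival pairs whose gap exceeded the deadline."""
    pairs = zip(arrival_times_ms, arrival_times_ms[1:])
    return [(prev, cur) for prev, cur in pairs if cur - prev > deadline_ms]


def exclusive_owner(strengths: Dict[str, int], alive: Set[str]) -> Optional[str]:
    """Pick the alive writer with the highest ownership strength for one key."""
    candidates = {w: s for w, s in strengths.items() if w in alive}
    if not candidates:
        return None
    # Ties are broken by writer name here (an assumption, see the lead-in).
    return max(candidates, key=lambda w: (candidates[w], w))
```

With arrivals at 0, 10, 20, 45, and 55 ms and a 15 ms deadline, the only violation is the gap between 20 and 45. With a primary writer at strength 10 and a backup at strength 5, the primary owns the key while it is alive and the backup takes over once the primary's liveliness lapses, which is the failover behavior that makes the ownership policy useful for redundant controllers.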

The footprint difference matters at the edge. A full DDS stack with its discovery database is comparatively heavy; Zenoh’s core is small enough to target microcontrollers via its pico variant, and MQTT’s client is famously tiny while pushing the weight onto the broker. If you are squeezing a protocol onto a constrained sensor node, MQTT-SN or Zenoh-pico are the realistic candidates, not full DDS.

It is worth being concrete about how these structural choices translate into tail latency and jitter, because for robotics the tail is what hurts, not the median. A control loop that usually closes in two milliseconds but occasionally spikes to forty will produce visible motion artifacts or, worse, trip a safety watchdog. MQTT’s broker hop adds a fixed baseline plus whatever queuing the broker introduces under load, and because TCP underlies it, a single lost packet stalls the stream until retransmission — head-of-line blocking that shows up as a latency spike exactly when the network is already stressed. DDS over UDP avoids TCP head-of-line blocking and can deliver with very low jitter on a quiet wired LAN, but its multicast fan-out means one slow or lossy reader can trigger retransmissions that ripple to others, and a mid-stream discovery event can momentarily steal CPU and bandwidth from data flow. Zenoh’s QUIC transport is interesting precisely here: QUIC multiplexes independent streams so a loss on one does not block the others, which keeps tail latency bounded over the lossy wireless links where TCP-based protocols degrade worst. None of this makes one protocol universally fastest; it means the protocol that wins on latency depends entirely on which part of the distribution — median or tail — and which network conditions your robot actually faces.
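Head-of-line blocking can be demonstrated with a small deterministic model. It is a simplification, not a network simulator: each packet is reduced to a stream name and the time it finally arrives (after any retransmission), ordering is the only effect modeled, and congestion control, which multiplexed streams still share, is ignored. The function names and the sample timings are hypothetical.

```python
from typing import Dict, List, Tuple

# (stream name, arrival time in ms after any retransmission), in send order
Packet = Tuple[str, float]


def deliver_single_ordered_stream(packets: List[Packet]) -> List[float]:
    """One ordered byte stream: nothing is delivered before earlier packets arrive."""
    delivered, latest = [], 0.0
    for _, arrival in packets:
        latest = max(latest, arrival)
        delivered.append(latest)
    return delivered


def deliver_independent_streams(packets: List[Packet]) -> List[float]:
    """Multiplexed streams: a packet only waits for earlier packets of its own stream."""
    latest_per_stream: Dict[str, float] = {}
    delivered = []
    for stream, arrival in packets:
        latest = max(latest_per_stream.get(stream, 0.0), arrival)
        latest_per_stream[stream] = latest
        delivered.append(latest)
    return delivered


# The second packet (lidar) is lost and only arrives at 42 ms after retransmission.
trace = [("pose", 1.0), ("lidar", 42.0), ("pose", 3.0), ("pose", 4.0), ("lidar", 5.0)]
```

On the single ordered stream every packet after the loss is held until 42 ms, so the delivery times are 1, 42, 42, 42, 42. With independent streams the pose updates are delivered at 1, 3, and 4 ms and only the lidar stream waits. The median barely moves between the two cases; the difference is entirely in the tail.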

There is also a subtler scaling axis the matrix hides: how each protocol behaves when the topology churns rather than when it is static. A warehouse fleet is never quiescent — robots power-cycle, dock, hand off between cells, and lose radio contact constantly. In DDS, each of these events is a discovery event that re-runs SEDP matching across affected participants, so a fleet in steady churn pays the discovery cost continuously, not just at boot. MQTT is indifferent to this because clients simply reconnect to the broker and the broker tracks subscriptions centrally. Zenoh’s gossip propagates only the changed routing state, so churn cost scales with what actually changed rather than with total participant count. For a static robot on a bench none of this registers; for a hundred-robot fleet in constant motion, churn behavior can dominate the protocol-selection decision more than steady-state latency does.

Trade-offs, Gotchas, and What Goes Wrong

Figure 4: A selection decision tree — route by ROS 2 nativeness, WAN exposure, telemetry simplicity, and whether you need built-in query and storage.

Every one of these protocols has a failure mode that bites teams who skipped the homework. DDS over Wi-Fi is the classic. RTPS leans on multicast for discovery and fan-out, and consumer and even industrial Wi-Fi handle multicast poorly — low data rates, packet loss, and access points that quietly drop or rate-limit multicast groups. Robots that work flawlessly on a wired bench fall apart on the warehouse floor, and the symptom is intermittent discovery failures that are maddening to debug. The fix is usually unicast discovery configuration or a discovery server, which adds operational complexity DDS was supposed to avoid.

MQTT’s gotcha is structural: the broker is a single point of failure and a single point of scale. One broker outage takes down the whole bus, and while clustered and bridged brokers exist, they reintroduce the distributed-systems problems MQTT’s simplicity was meant to dodge. MQTT also has no native concept of typed data or peer-to-peer real-time delivery, so it is poorly suited to tight control loops — it is a telemetry and command bus, not a robotics control fabric.

Zenoh’s main risk is maturity. It is production-grade and advancing fast, but it has a smaller ecosystem, fewer battle-tested deployments, and a thinner pool of engineers than DDS or MQTT. The rmw_zenoh layer is improving release over release but is younger than the DDS RMWs that have shipped robots for years. Betting a safety-critical program on it means accepting that you may hit edges the larger communities have already sanded down.

Security models diverge sharply too. DDS-Security defines pluggable authentication, access control, and cryptographic plugins for fine-grained, per-topic protection — powerful but complex to configure. MQTT relies on TLS for transport plus username/password and broker-side ACLs, simple and well understood but coarse-grained. Zenoh uses TLS, link-level authentication, and access tokens scoped to key expressions. None is strictly better; they target different threat models, and conflating them during a migration is a reliable way to ship an insecure system.

A final gotcha cuts across all three: interoperability is never free, even when bridges exist. A Zenoh-to-DDS bridge faith
