Cilium Sidecarless Service Mesh: An eBPF Deep-Dive

Cilium’s sidecarless service mesh answers a question that has nagged platform teams for half a decade: why does every pod need its own proxy container just to get encryption and traffic policy? Cilium’s answer is to push that work into the Linux kernel using eBPF, eliminating the per-pod sidecar entirely. By June 2026, this eBPF service mesh is no longer a fringe bet. Cilium reached CNCF graduated status in October 2023 and has since become a default networking layer for serious Kubernetes deployments, with native mTLS shipping in March 2026.

This deep-dive is for engineers who already run Kubernetes and want to understand the datapath, not just the marketing. Here is what this article covers: the sidecar problem and why eBPF helps, exactly how the sidecarless datapath, identity, and L7 policy work, a hands-on Cilium tutorial with working YAML and CLI, an honest comparison with Istio’s ambient mode, the trade-offs that bite in production, and a practical adoption checklist. Every benchmark claim is attributed to a named source, and every config snippet is clearly marked so you know what is illustrative versus copy-paste-ready.

Context: the sidecar problem and why eBPF

The sidecar model — popularised by Istio and Linkerd — injects a proxy container, usually Envoy, into every application pod. That proxy intercepts all inbound and outbound traffic, terminates mTLS, and enforces L7 policy. It works, and it works well enough that it defined what a service mesh meant for years. But the tax is real. Every request between two services traverses two extra userspace proxies: the source pod’s sidecar and the destination pod’s sidecar. On a cluster with thousands of pods, you are running thousands of Envoy instances, each consuming memory and CPU even when idle, and each adding latency hops on the request path.

The cost was never only CPU. It was also operational. Sidecar injection mutates your pods through an admission webhook, which complicates startup ordering — your app container can race the proxy and fail its first connections before the mesh is ready. A proxy version upgrade becomes a fleet-wide pod restart, because the proxy lives inside the workload pod. Init containers, jobs, and anything that exits cleanly all need special handling so the sidecar does not keep the pod alive. None of these are dealbreakers individually, but together they are the friction that made teams ask whether there was a better shape for the problem.

The eBPF service mesh pitch is to do the same work without the extra processes. As TechTarget reported when the sidecarless approach first sparked debate, Cilium handles connection management, load balancing, and policy enforcement with eBPF programs running directly in the kernel, reserving a shared proxy only for the cases that genuinely need application-layer parsing. The Cilium project’s own framing is that eBPF lets it revisit the original promise of node-local simplicity while still delivering per-pod context and control — identity and policy stay precise, but the per-pod process disappears.

This is the crux of the architectural argument. eBPF is a kernel technology that lets you attach small, verified programs to hooks throughout the networking stack. Because those programs run in the kernel as packets flow through, Cilium can make routing, load-balancing, and policy decisions without ever copying the packet up into a userspace proxy. The sidecar’s job has not vanished — it has moved down a layer, into a place where it is cheaper to execute. That is the entire thesis of the sidecarless mesh, and the rest of this article is about whether the thesis survives contact with production.

How Cilium’s sidecarless mesh works

The headline simplification is that there is no proxy inside your application pod. Instead, a single Cilium agent runs as a DaemonSet on every node, programs eBPF into the kernel, manages identity, and stands up a shared node-local Envoy only when an L7 rule actually needs it. The figure below shows the moving parts on a single node: the agent, the kernel eBPF programs, the optional Envoy, the identity allocator backed by SPIRE, the Cilium operator that coordinates cluster-wide state, and Hubble for observability.

Understanding these components is the difference between operating Cilium confidently and treating it as a black box. The agent is the workhorse: it watches the Kubernetes API, translates your policies and services into eBPF maps and programs, and reconciles the kernel state on every node. The operator handles cluster-scoped concerns like IP address management and garbage collection. Hubble taps into the same eBPF datapath the agent programs, which is why its observability is essentially free — it reads the flow data the kernel is already producing rather than re-parsing traffic in a separate pipeline.

The eBPF datapath

eBPF programs attach to kernel hooks, namely the traffic-control (TC) ingress and egress hooks, XDP at the driver level, and socket hooks at the syscall boundary, and run as the packet moves through the network stack. In a sidecarless Cilium deployment, this is where L4 routing, service load balancing, NAT, and connection tracking happen. Because the logic executes inside the kernel, packets between two pods on the same node can be short-circuited without ever climbing into userspace, and even cross-node traffic avoids the proxy hops a sidecar mesh would impose.

The practical consequence is fewer context switches and fewer memory copies. Cilium’s own L7 documentation confirms that purely L3/L4 traffic never touches a proxy at all; the eBPF datapath decides forward-or-drop in the kernel. That is the source of Cilium’s frequently cited latency advantage for L4-only workloads. It also explains why Cilium can replace kube-proxy outright: the same eBPF machinery that enforces policy also implements Kubernetes service load balancing, so you can delete kube-proxy and its iptables rules entirely and let eBPF maps do the routing.
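As a rough illustration of what removing kube-proxy implies for configuration, the Helm values below sketch a kube-proxy-free setup. This assumes a Helm-based install. API_SERVER_HOST is a placeholder rather than a real address, and the port shown is only a common default, so check both against your own cluster.

```yaml
# Helm values sketch for running Cilium without kube-proxy.
# API_SERVER_HOST is a placeholder, not a real address: substitute the
# endpoint your nodes use to reach the Kubernetes API server.
kubeProxyReplacement: true
k8sServiceHost: API_SERVER_HOST
# 6443 is a common API server port, assumed here for illustration.
k8sServicePort: 6443
```

The explicit host and port matter because, once kube-proxy is gone, the agent can no longer rely on the kubernetes Service address being routable before its own datapath is up.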

This kernel-resident datapath is also why Cilium blurs the line between CNI and service mesh. In the sidecar world, the CNI plugin and the mesh are separate layers stacked on top of each other. In Cilium they are the same eBPF datapath wearing two hats. That unification is a genuine operational simplification — one component to install, upgrade, and observe — but it also means your CNI choice and your mesh choice become the same decision, which is something to weigh carefully if you are already committed to a particular CNI.

mTLS and identity

Identity is the foundation of everything Cilium does for security, and it is worth getting right in your head. Cilium does not trust IP addresses, because in Kubernetes an IP is ephemeral and reusable: the pod behind it can change in seconds. Instead, every endpoint is assigned a numeric security identity derived from its labels, and endpoints that carry the same set of security-relevant labels share the same identity. Policy is written against those identities, not addresses, which is what makes Cilium policy stable as pods churn. Cryptographic proof of identity is a separate layer, covered next.
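A minimal sketch of what "policy against identities" looks like in practice: the rule below selects peers purely by label and contains no IP address or CIDR. The policy name, the shop namespace, and the app labels are illustrative assumptions that mirror the example used later in the tutorial.

```yaml
# Illustrative sketch: identity-based ingress, selected by labels rather than IPs.
# Name, namespace, and labels are hypothetical examples.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: shop
spec:
  # The endpoints this policy protects, chosen by label.
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    # Allowed peers, again chosen by label; pod IPs never appear.
    - fromEndpoints:
        - matchLabels:
            app: frontend
```

Because the selector resolves to an identity, a frontend pod that is rescheduled with a new IP remains allowed without any change to the policy.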

For mutual authentication, Cilium has long used SPIFFE identities backed by a SPIRE server. The earlier design performed an out-of-band mTLS handshake between Cilium agents to establish that two endpoints were who they claimed to be, after which eBPF — or WireGuard / IPsec for the encryption itself — carried the traffic. The separation is deliberate: the handshake proves identity, and a separate, fast transport encryption protects the bytes.

In March 2026 the project announced native mTLS, folding a ztunnel-style data path into Cilium so that encryption and decryption can be handled transparently by the kernel datapath and node-level agents. The stated goal is fully transparent, performant encryption where applications never need to be mTLS-aware — no certificates to mount, no client libraries to integrate, no code changes. It is worth being sober about this, though. The New Stack has documented edge cases where the earlier out-of-band mutual-auth design could be bypassed under specific conditions. The lesson is not that Cilium mTLS is broken, but that you must validate your threat model against the specific version you deploy rather than assuming any mesh’s encryption is bulletproof by default.

L7 policy

When you need HTTP or Kafka-aware rules, eBPF alone is not enough, because somebody has to parse the application protocol, and parsing HTTP is not something you want to do in a verified kernel program. Cilium’s answer is the part that surprises people: it does not inject a sidecar. Instead, as Cilium’s L7 policy blog explains, it transparently redirects only the matching traffic to a node-local Envoy proxy that is shared by every pod on that node. The Envoy can be deployed as its own DaemonSet or embedded directly in the agent pod. DNS-aware rules follow the same redirect pattern, but they are served by a DNS proxy built into the Cilium agent rather than by Envoy.

This is the architectural compromise that makes the whole model coherent. Keep the cheap, common L4 path in the kernel where it is fast, and pay for a userspace proxy only on the specific flows that genuinely require deep inspection. A cluster where most traffic is plain TCP with a handful of HTTP policies gets the best of both worlds: kernel-speed for the bulk, full L7 enforcement for the flows that need it. The cost model is per-node and per-policy rather than per-pod, which is a dramatically smaller multiplier on a large cluster. The trade-off, which we will return to, is that the node-local Envoy becomes a shared dependency — when it is embedded in the agent, the availability of L7-policed traffic is tied to the availability of the Cilium agent pod on that node.
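One way to loosen that coupling is to run the node-local Envoy as its own DaemonSet instead of embedding it in the agent. The Helm values below are a sketch of that choice, assuming a Helm-based install; confirm the defaults for the Cilium version you pin, since newer releases may already deploy Envoy this way.

```yaml
# Helm values sketch: keep L7 support on and run Envoy as a separate
# DaemonSet rather than inside the agent pod.
l7Proxy: true
envoy:
  enabled: true
```

With this layout an agent restart does not necessarily restart the proxy, although both remain per-node shared components.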

Hands-on setup

This section is a compact Cilium tutorial. The snippets below are realistic and clearly marked. Test them on a non-production cluster first, and pin to a version you have validated rather than tracking latest.

Install Cilium with the mesh-relevant features enabled using the Cilium CLI:

# Working example — Cilium CLI install with mesh features
cilium install \
  --version 1.18.0 \
  --set kubeProxyReplacement=true \
  --set l7Proxy=true \
  --set encryption.enabled=true \
  --set encryption.type=wireguard \
  --set authentication.mutual.spire.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true

cilium status --wait

A quick tour of those flags. kubeProxyReplacement=true lets eBPF replace kube-proxy entirely, so you can remove the kube-proxy DaemonSet and its iptables churn. l7Proxy=true enables the node-local Envoy that backs L7 rules. encryption.type=wireguard turns on transparent transport encryption — WireGuard is the simpler operational choice for most teams, while IPsec is available if you have compliance reasons to prefer it. The SPIRE flag wires up SPIFFE-backed identity for mutual authentication, and the Hubble flags give you observability from the first minute.
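If you manage Cilium through Helm or GitOps rather than the CLI, the same flags can be kept in a values file. The sketch below is a direct transcription of the --set flags above and adds nothing else; the file name values.yaml is just a convention.

```yaml
# values.yaml: the same settings as the cilium install flags above
kubeProxyReplacement: true
l7Proxy: true
encryption:
  enabled: true
  type: wireguard
authentication:
  mutual:
    spire:
      enabled: true
hubble:
  relay:
    enabled: true
  ui:
    enabled: true
```

Keeping the settings in a file makes the pinned version and feature set reviewable in version control, which is helpful when you later rehearse a rollback.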

Confirm the install landed before going further:

# Confirm datapath features are active
cilium status
cilium config view | grep -E "kube-proxy|l7|encrypt|wireguard"

Now apply an L7 policy. The example below allows pods labelled app=frontend to call only GET /api/products on the backend service, dropping every other method and path. Note that this is an egress policy on the frontend, expressed against the backend’s identity rather than its IP:

# Working example — CiliumNetworkPolicy with HTTP L7 rules
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: frontend-to-backend-l7
  namespace: shop
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  egress:
    - toEndpoints:
        - matchLabels:
            app: backend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/products"

The moment this policy is applied, Cilium recognises that an HTTP rule is in play and begins redirecting matching traffic on port 8080 through the node-local Envoy, which authorizes each request against the method and path before letting it reach the backend. A request that does not match, such as a POST or a request to /api/admin, never reaches the backend: the proxy rejects it with an HTTP 403 response, and Hubble records the flow with a DROPPED verdict.
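One side effect is easy to miss. Once any egress rule selects the frontend pods, their egress becomes default-deny, so DNS lookups are blocked unless they are allowed explicitly. The companion policy below is a sketch of the usual fix. It assumes cluster DNS runs in kube-system with the k8s-app=kube-dns label, which is common but worth verifying, and the policy name is hypothetical.

```yaml
# Illustrative sketch: allow the frontend to resolve names via cluster DNS.
# Assumes DNS pods are labelled k8s-app=kube-dns in kube-system.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: frontend-allow-dns
  namespace: shop
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  egress:
    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              # Permit lookups for any name; tighten the pattern if needed.
              - matchPattern: "*"
```

Without a rule like this, the first symptom is usually name-resolution timeouts in the frontend rather than an obvious policy error, so it is worth applying alongside the L7 policy.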

To require mutual authentication on that same flow, add an authentication block so the connection is only permitted after a successful mTLS handshake between the two endpoints’ identities:

# Working example — require mutual authentication on the egress rule
      authentication:
        mode: "required"
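Since the fragment above is shown out of context, here is a sketch of where it sits in the full policy. Everything except the authentication block is copied from the earlier frontend-to-backend-l7 example; the block is a sibling of toEndpoints and toPorts inside the same egress rule.

```yaml
# Sketch: the earlier L7 policy with mutual authentication required
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: frontend-to-backend-l7
  namespace: shop
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  egress:
    - toEndpoints:
        - matchLabels:
            app: backend
      # Traffic matching this rule is allowed only after mutual auth succeeds.
      authentication:
        mode: "required"
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/products"
```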

Verify what the datapath actually does with Hubble, which reads directly from eBPF rather than from a separate logging sidecar:

# Observe L7-aware flows in real time
cilium hubble port-forward &
hubble observe --namespace shop --protocol http --follow

# Watch policy verdicts, including drops
hubble observe --namespace shop --verdict DROPPED --follow

When an L7 policy applies, the request takes a detour through the node-local Envoy before being reinjected into the eBPF datapath; when no L7 rule matches, it stays entirely in the kernel. The figure below traces both paths so you can see exactly where the proxy enters the picture.

A useful habit during rollout is to deploy policies in a permissive posture first and watch Hubble for the drops you would have caused, then flip to enforcing once the flows look right. This avoids the classic failure mode of shipping a too-tight policy and discovering the blast radius in production.
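One way to get that permissive posture is Cilium’s policy audit mode, sketched below as a Helm value under the assumption of a Helm-based install. In audit mode, traffic that a policy would have denied is allowed and logged instead. As far as the documentation describes it, this applies to network-layer verdicts, so confirm how the version you pin treats L7 rules before relying on it for HTTP policies.

```yaml
# Helm values sketch: log would-be denials instead of enforcing them.
# Set back to false once the observed flows look right.
policyAuditMode: true
```

While audit mode is on, filtering Hubble with --verdict AUDIT instead of --verdict DROPPED shows the flows that would have been denied under enforcement.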

Cilium vs Istio ambient

The honest comparison is Cilium mesh vs Istio ambient mode, because both removed the per-pod sidecar — they simply did it differently, and the difference is instructive. Istio’s ambient mode, as Istio’s own engineering blog describes, splits the work into two layers: a per-node ztunnel DaemonSet that handles L4 and mTLS as a zero-trust tunnel, and optional per-namespace waypoint Envoy proxies for advanced L7 traffic management. Cilium pushes L4 and encryption into the kernel via eBPF and WireGuard, and uses a node-local Envoy for L7. The single most important distinction is where the L4 mTLS path lives: in userspace ztunnel for Istio, in the kernel for Cilium.

Performance results are nuanced and depend heavily on the workload, which is exactly why you should distrust any single chart — including the ones below. Istio’s own published benchmark found Cilium fastest with default parameters, because eBPF runs in kernel space and avoids the userspace hop. But the same benchmark observed that Cilium slowed substantially once L7 policy and encryption were both enabled, and that Cilium’s CPU and memory stayed high even when no traffic was flowing through the mesh. Istio ambient, by contrast, showed steadier throughput under encryption and settled to a fraction of Cilium’s CPU when idle. These are Istio’s numbers about a competitor, so read them with appropriate skepticism — but the shape of the result, that kernel-speed favours simple L4 while a userspace design can be steadier under heavy L7, is a reasonable mental model to carry into your own testing.

Dimension	Cilium sidecarless mesh	Istio ambient mode
L4 mTLS location	Kernel eBPF plus WireGuard or IPsec	Userspace ztunnel DaemonSet
L7 proxy	Node-local Envoy when needed	Per-namespace waypoint Envoy
CNI relationship	Mesh and CNI are one eBPF stack	Mesh layered on any CNI
Idle resource use	Higher in Istio’s benchmark	Lower in Istio’s benchmark
L4-only latency	Lowest per Istio’s benchmark	Slightly higher
Kernel dependency	High — needs a modern kernel	Lower — more portable
L7 feature richness	Growing, still catching up	Mature Envoy ecosystem
Best fit	Mostly L4 and DNS, unified stack	Heavy, stable L7 traffic shaping

For pure L3/L4 use cases without encryption, Istio’s analysis concluded that Cilium is a cost-effective and performant solution. For larger clusters that prioritise stability, scalability, and advanced L7 features, the same analysis pointed toward ambient mode paired with an appropriate network-policy implementation. Crucially, several 2026 practitioners are no longer treating this as an either-or. A published EKS egress comparison recommends a hybrid: Cilium for performant L3/L4/DNS-based egress policy, with Istio ambient layered on top where deep L7 traffic management is genuinely required. That hybrid acknowledges the reality that Cilium’s strength is the kernel datapath and Istio’s strength is its mature L7 feature set, and that you can have both.

Trade-offs and what goes wrong

The kernel is Cilium’s superpower and its constraint, and you cannot adopt the sidecarless mesh responsibly without internalising that. Because the datapath is eBPF, you need a reasonably modern Linux kernel. Older nodes, certain locked-down managed offerings, or exotic kernel builds can limit which features you actually get. The failure mode here is subtle: a heterogeneous fleet can leave some node pools without encryption or full kube-proxy replacement while others have them, producing inconsistent behaviour that is maddening to debug. Confirm kernel versions across every node pool before committing, and treat kernel version as a first-class part of your node-pool definition.

Debugging eBPF is genuinely harder than reading a sidecar’s access log, and this is the trade-off teams most underestimate. When traffic is dropped in the kernel, there is no Envoy log line to grep. You reach instead for cilium monitor, Hubble flow logs, and occasionally bpftool to inspect the eBPF maps directly. The tooling is good and improving quickly — Hubble in particular is excellent — but the mental model is unfamiliar to engineers who grew up debugging userspace proxies. Budget time for the team to build that muscle, and make sure more than one person on the team can drive these tools before you depend on the mesh in production.

There are also feature gaps relative to a mature L7 mesh. Advanced traffic-management primitives — fine-grained retries, fault injection, weighted canary trees, complex header-based routing — are still richer in Envoy-centric meshes that have spent years building them out. Cilium is closing the gap and the Gateway API integration helps, but if your application platform leans heavily on sophisticated L7 routing, audit your specific requirements against the current Cilium version rather than assuming parity. The idle-resource behaviour flagged in Istio’s benchmark is worth profiling on your own hardware, because it directly affects your cost at scale. And the mTLS caveat from earlier bears repeating: read the release notes for the specific version’s authentication mode rather than assuming the defaults are airtight.

Finally, consider the shared-dependency shape of the node-local Envoy. Because L7 enforcement is per-node rather than per-pod, an issue with the agent or the embedded Envoy on a node affects every L7-policed flow on that node at once. That is usually a fine trade — one shared, well-monitored proxy is easier to reason about than thousands of sidecars — but it changes your failure domain, and you should test what happens when that node-local proxy is unhealthy so the answer is not a surprise during an incident.

Practical recommendations and checklist

Adopt the sidecarless mesh deliberately, not because it is fashionable. The technology is genuinely strong, but it rewards teams who understand the kernel dependency and the L7 cost model, and it punishes teams who treat it as a drop-in sidecar replacement. Use this checklist before you roll it out:

Confirm a supported, consistent kernel version across all node pools, and pin it.
Classify your traffic: mostly L4 and DNS leans favourable for Cilium; L7-heavy traffic should make you seriously evaluate Istio ambient or a hybrid.
Run your own latency and idle-resource benchmark on representative hardware; do not trust a vendor chart, including the ones cited here.
Pin an exact Cilium version and read its mTLS and authentication release notes before enabling encryption.
Enable Hubble from day one so you have observability before you have an incident, not after.
Start every L7 policy in an audit or observe posture, then tighten to enforcing once Hubble confirms the flows.
Test a node-local Envoy or agent failure deliberately to understand the blast radius of L7 traffic on that node.
Make sure at least two engineers can drive cilium monitor, Hubble, and bpftool before you depend on the mesh.
Document a concrete rollback path back to your previous CNI or mesh, and rehearse it.

If you are building toward zero trust, pair the mesh with a broader strategy — our zero-trust network architecture implementation guide for 2026 covers identity, segmentation, and policy beyond the datapath, and our Kubernetes network policy egress and RDS guide is a useful companion for locking down egress traffic to managed data stores.

FAQ

Is Cilium service mesh truly sidecarless?
Yes for L3, L4, and encryption, which run entirely in the kernel via eBPF with no per-pod proxy. For L7 rules it uses a shared node-local Envoy, but that is one proxy per node — not one per pod — so there is genuinely no per-pod sidecar in your application pods.

Does Cilium support mTLS without sidecars?
Yes. Cilium uses SPIFFE identities backed by SPIRE for mutual authentication and, as of the March 2026 native mTLS release, can handle transparent encryption through the kernel datapath and node-level agents so applications stay completely mTLS-unaware — no certificates to mount and no client-library changes.

Cilium mesh vs Istio — which is faster?
Istio’s own benchmark found Cilium fastest for L4-only traffic because eBPF runs in kernel space, but noted that Cilium slows under combined L7 and encryption and consumes more idle CPU. The honest answer is that it depends on your workload, so benchmark your own traffic before deciding.

Do I need a special kernel for Cilium’s mesh?
You need a reasonably modern Linux kernel. Verify kernel versions across every node pool, since features like full kube-proxy replacement and transparent encryption depend on kernel support, and an inconsistent fleet causes hard-to-debug behaviour.

Can I run Cilium and Istio together?
Yes, and many teams in 2026 do exactly this: they use Cilium as the CNI for fast L3, L4, and DNS-based policy, and layer Istio ambient on top for advanced L7 management where it is genuinely needed. The hybrid plays to each project’s strengths.
