Kubernetes Gateway API vs Ingress vs Service Mesh (2026)
Every cluster eventually hits the same wall. The Kubernetes Gateway API vs Ingress question stops being academic the day a platform team discovers that a single shared Ingress resource has accumulated forty vendor-specific annotations, three of which silently conflict, and that nobody can change a timeout without risking an unrelated team’s traffic. Ingress was designed in 2015 for a world of simple HTTP routing. The workloads we run in 2026 (multi-tenant platforms, mTLS-by-default east-west traffic, header-based canary releases, gRPC and TCP backends) long ago outgrew it. This post is an architecture decision record. It lays out the three real options for moving traffic in and across a cluster, the consequences of each, and a defensible recommendation you can paste into your own ADR with the names changed.
What this covers: the role-oriented Gateway API model, where Ingress is still the right call, how service mesh and GAMMA handle east-west traffic, an options matrix, and a concrete Ingress-to-Gateway migration path.
Context and Background
Kubernetes has accumulated several answers to “how does a request reach a Pod,” and they were designed at different times for different problems. The oldest is the Service object, which gives a stable virtual IP and load-balances across Pods but stops at L4. On top of that sits Ingress, the L7 HTTP router that became the default way to expose web traffic. Ingress works, and for a single team running a handful of hostnames it still works well. The trouble is structural: the Ingress spec deliberately standardised almost nothing beyond host and path matching. Everything interesting (TLS policy, rewrites, header manipulation, canary weights, rate limits, timeouts) lives in controller-specific annotations.
That design produced two chronic failures. The first is annotation sprawl: a production Ingress for a busy platform routinely carries dozens of nginx.ingress.kubernetes.io/* keys whose semantics are documented in a vendor wiki, not the API. Because annotations are free-form strings the API server never validates, a typo in nginx.ingress.kubernetes.io/proxy-body-size is accepted, stored, and silently ignored — the failure surfaces only as a confusing 413 in production. Worse, annotation behaviour is not portable: the same canary intent expressed for ingress-nginx, Traefik, and HAProxy uses three different keys with three different semantics, so a migration between controllers is a rewrite, not a swap.
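To make the sprawl concrete, here is a sketch of what such an Ingress can look like. The names, hostname, and values are hypothetical. The first two annotation keys are ones ingress-nginx documents; the third is deliberately misspelled to show the failure described above: the API server stores it as-is, and a controller that does not recognise the key will typically ignore it.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shop                # hypothetical name
  namespace: shop           # hypothetical namespace
  annotations:
    # Free-form strings: the API server checks none of these keys or values.
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    # Deliberate typo ("sise" instead of "size"): accepted and stored,
    # so the body-size limit you intended is never applied.
    nginx.ingress.kubernetes.io/proxy-body-sise: "8m"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - shop.example.com
      secretName: shop-tls  # hypothetical Secret holding the certificate
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /api/orders
            pathType: Prefix
            backend:
              service:
                name: orders
                port:
                  number: 8080
```

Only the `spec` block is validated against a schema; everything under `annotations` is opaque to Kubernetes.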
The second is the one-resource-fits-all problem. A single Ingress object mixes infrastructure concerns (which load balancer, which TLS certificate) with application concerns (which path goes to which service), so a platform team and an application team are forced to edit the same YAML, and RBAC cannot cleanly separate them. Kubernetes RBAC grants verbs on resource kinds, not on fields within a resource, so you cannot say “this team may edit the path rules but not the TLS block” — both live in the same Ingress object. The result in practice is either an over-broad grant that lets app teams change infrastructure, or a bottleneck where every routing change queues behind the platform team.
The Kubernetes Gateway API is the official answer. Its core resources (GatewayClass, Gateway, and HTTPRoute) reached General Availability as the v1 API version in the v1.0 release (October 2023), and the project’s stance is explicit: Gateway API is the long-term successor to Ingress, expressive enough to cover the annotation cases as first-class typed fields. Separately, the GAMMA initiative extends the same resources to east-west, service-to-service traffic, which is the territory a service mesh has always owned. Understanding which of these three (Ingress, Gateway API, or mesh) belongs where is the whole game, and it starts with one axis: the direction the traffic flows. For teams running clusters at the edge, the same routing questions interact with constrained hardware, a topic covered in our k3s production edge guide.
The Decision: North-South vs East-West and the Resource Model
The single most useful distinction in cluster networking is direction. North-south traffic enters or leaves the cluster — a browser hitting your API, a webhook from a payment provider. East-west traffic moves between services inside the cluster — the orders service calling payments. Ingress and the Gateway API are north-south tools. A service mesh is fundamentally an east-west tool, and GAMMA is the bridge that lets the Gateway API describe east-west routing too. Pick the wrong tool for the direction and you fight the platform forever.

Figure 1: North-south traffic enters through a Gateway and is dispatched by HTTPRoutes; east-west traffic between services flows through mesh sidecars that enforce mTLS and policy.
Figure 1 separates the two planes. On the north-south plane an external client reaches a Gateway — the cluster’s L7 entry point — which delegates to one or more HTTPRoute objects that hold the host and path rules. Those routes hand off to in-cluster services. On the east-west plane, requests between services pass through mesh data-plane proxies that add mutual TLS, retries, and authorization without the application code knowing. The two planes are independent: you can adopt a Gateway for ingress without any mesh, run a mesh without exposing anything externally, or run both and let GAMMA unify the vocabulary.
The Gateway API resource model and role separation
The Gateway API’s defining idea is that one monolithic object is the wrong abstraction, so it splits routing across three resources owned by different people. A GatewayClass is cluster-scoped and names the controller implementation — it is the analogue of a StorageClass, installed once by whoever runs the platform. A Gateway is the actual listener: it declares ports, protocols, TLS certificates, and which namespaces are allowed to attach routes. An HTTPRoute (or GRPCRoute, TCPRoute, TLSRoute) holds the application-level rules — match this hostname and path, rewrite this header, split traffic eighty-twenty across two backends.
That split maps directly onto organisational roles. The infrastructure team owns the GatewayClass and Gateway; they decide which load balancer fronts the cluster and which certificate terminates TLS. Application teams own their HTTPRoute objects in their own namespaces and can ship routing changes without a ticket to platform. The Gateway controls delegation through its allowedRoutes field, so a platform team can permit specific namespaces to attach while denying others. This is the role separation Ingress could never express, because Ingress fused all of it into one resource.
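A sketch of the platform-owned side might look like the following. The class name, controller name, certificate Secret, and namespace label are all assumptions made for illustration; substitute the values your controller and cluster actually use. The final Role shows one way to let an application team manage only HTTPRoute objects in its own namespace.

```yaml
# Installed once by the platform team.
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: example-class                       # hypothetical class name
spec:
  # Hypothetical value; use the controllerName your implementation documents.
  controllerName: example.com/gateway-controller
---
# The listener itself, also platform-owned.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: public-gateway
  namespace: infra
spec:
  gatewayClassName: example-class
  listeners:
    - name: https
      hostname: "*.example.com"
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - name: wildcard-example-com      # hypothetical Secret in "infra"
      allowedRoutes:
        namespaces:
          from: Selector                    # only labelled namespaces may attach
          selector:
            matchLabels:
              gateway-access: public        # hypothetical label
---
# Application-team permissions: routes only, in their own namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: httproute-editor                    # hypothetical name
  namespace: shop
rules:
  - apiGroups: ["gateway.networking.k8s.io"]
    resources: ["httproutes"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```

Because the Gateway and the routes are different resource kinds, ordinary RBAC can now draw the line that a single Ingress object made impossible.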

Figure 2: The infrastructure role owns GatewayClass and Gateway; application teams own their own route objects, attached to the Gateway through allowedRoutes.
Figure 2 shows the ownership boundary. A minimal HTTPRoute is readable and self-documenting in a way an annotation-laden Ingress never is:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: orders
  namespace: shop
spec:
  parentRefs:
    - name: public-gateway
      namespace: infra
  hostnames:
    - "shop.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/orders
      backendRefs:
        - name: orders
          port: 8080
The parentRefs field is the attachment point: this route asks to bind to public-gateway in the infra namespace, and that Gateway’s allowedRoutes policy decides whether to accept it. Traffic splitting is a typed field too — list two backendRefs with weight values and the controller load-balances by weight, no annotation required. A blue-green or canary cut is then a one-field change from weight: 90 / 10 to 50 / 50 to 0 / 100, reviewable in a pull request and reversible with a git revert, which is a categorically safer operation than editing a free-form annotation string.
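As a sketch of the weighted split described above, the same route can list two backends. The `orders-v2` Service is a hypothetical canary added for illustration.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: orders
  namespace: shop
spec:
  parentRefs:
    - name: public-gateway
      namespace: infra
  hostnames:
    - "shop.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/orders
      backendRefs:
        - name: orders          # current version
          port: 8080
          weight: 90
        - name: orders-v2       # hypothetical canary Service
          port: 8080
          weight: 10
```

Weights are proportional rather than percentages, so 90 and 10 behave the same as 9 and 1. Promoting the canary means editing only the two `weight` values.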
There is one more mechanism worth understanding: the status and conditions model. When you apply an HTTPRoute, the controller writes back a status.parents[].conditions block reporting whether the route was Accepted by the parent and whether its references resolved (ResolvedRefs: the backends exist and the route is permitted to reference them). ResolvedRefs says nothing about whether the backend Pods are healthy or reachable. A cross-namespace backendRef additionally requires a ReferenceGrant in the target namespace, so a team cannot route to another team’s service without that team’s explicit consent; security is default-deny by design, not by annotation convention. This bidirectional contract, in which you declare intent and the controller reports reconciled reality, is something Ingress never provided. With Ingress you applied YAML and then read controller logs to find out whether it worked; with the Gateway API the resource itself tells you, which makes the whole system far more amenable to GitOps reconciliation and automated health gates.
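The consent step can be sketched as follows, assuming a hypothetical `payments` namespace whose owners agree to receive traffic from routes in `shop`. ReferenceGrant has been served as `v1beta1`; check which version your installed CRDs provide before copying this.

```yaml
# Lives in the TARGET namespace, so only its owners can create it.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-shop-routes       # hypothetical name
  namespace: payments           # hypothetical target namespace
spec:
  from:
    # Who is allowed to reference into this namespace.
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: shop
  to:
    # What they may reference; the empty group means the core API group.
    - group: ""
      kind: Service
      name: payments            # optional: omit to allow every Service here
```

Without this object, a `backendRef` from `shop` into `payments` should leave the route's ResolvedRefs condition false instead of silently routing.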
Ingress today is still fine for simple cases
None of this means Ingress is dead or that you must rip it out tomorrow. For a single team exposing a few hostnames over HTTPS with cert-manager handling certificates, Ingress is less to learn, has a decade of battle-tested controllers, and every tutorial assumes it. The Gateway API’s value scales with complexity: multi-tenancy, role separation, header-based routing, non-HTTP protocols, and weighted canaries. If you have none of those needs, a stable Ingress is a perfectly defensible status quo, and “do nothing” is a legitimate option in any honest ADR. The mistake is reaching for the next annotation when the requirement has clearly outgrown what annotations can cleanly express.
A useful heuristic is to count the annotations on your busiest Ingress. Zero to two annotations — TLS redirect, body size — and a single owning team is a green light to stay put; the Gateway API would add ceremony without buying you anything. Five or more annotations, or two-plus teams editing the same object, or a standing request for traffic splitting or gRPC, and you have crossed into territory the typed API exists to serve. The decision is not ideological; it is a function of how much your routing has outgrown a string map. It is also worth noting that Ingress and the Gateway API are not mutually exclusive at the cluster level — many platforms run legacy apps on Ingress indefinitely while all new services land on the Gateway API, letting the old resources age out naturally instead of forcing a risky bulk migration.
Service mesh and GAMMA handle east-west
A service mesh — Istio, Linkerd, Cilium’s mesh — solves a different problem: securing and observing traffic between services. It injects a data-plane proxy (a sidecar, or increasingly a per-node or ambient proxy) that transparently adds mutual TLS so service-to-service calls are encrypted and identity-verified, plus retries, timeouts, circuit breaking, and golden-signal metrics, all without touching application code.
The data-plane topology is itself a design choice with real cost implications. In the classic sidecar model every Pod gets its own proxy container, so a 500-Pod service runs 500 extra proxies, each consuming tens of megabytes of memory and a fraction of a CPU. Every request also crosses two proxy hops (egress from the caller, ingress to the callee), typically a low-single-digit-millisecond latency tax per hop. The newer ambient and per-node models (Istio ambient, Cilium’s eBPF mesh) move the L4 mTLS function to a shared per-node component and reserve full L7 proxies only for workloads that need them, which cuts the proxy count and the idle overhead dramatically. The trade-off is a younger, more complex architecture. Either way, the mesh tax is real and must be budgeted, not assumed away.
Historically each mesh invented its own CRDs for routing — Istio’s VirtualService, Linkerd’s ServiceProfile — so adopting a mesh meant learning a bespoke API. The GAMMA initiative (Gateway API for Mesh Management and Administration) changes that by letting the same HTTPRoute describe east-west routing: you attach the route to a Service instead of a Gateway, and the mesh applies it to in-cluster traffic. One vocabulary now spans both north-south ingress and east-west mesh, which is the strategic reason the Gateway API matters beyond just replacing Ingress. A platform engineer who learns the route model once can reason about both planes, and tooling that lints or generates routes works in both contexts. If your observability story leans on eBPF rather than sidecars, our eBPF observability ADR covers the data-plane trade-offs in depth.
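A minimal sketch of a GAMMA-style route, assuming a mesh that implements the Gateway API mesh support and hypothetical `payments` and `payments-v2` Services. The only structural difference from a north-south route is that the parent is a Service.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: payments-split          # hypothetical name
  namespace: payments           # hypothetical namespace
spec:
  parentRefs:
    # Attaching to a Service (core API group) instead of a Gateway
    # marks this as an east-west route for the mesh to apply.
    - group: ""
      kind: Service
      name: payments
      port: 8080
  rules:
    - backendRefs:
        - name: payments        # current version
          port: 8080
          weight: 90
        - name: payments-v2     # hypothetical canary Service
          port: 8080
          weight: 10
```

In-cluster callers keep addressing `payments` as before; the mesh, not the caller, applies the split.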
Options, Consequences, and Migration
An ADR is only useful if it states the options it rejected and why. There are three credible paths for cluster traffic, and most real platforms end up combining two of them. Below is each option with its consequences, a comparison matrix, and the migration path that connects them.

Figure 3: A request matches a Gateway listener, then an HTTPRoute’s hostname and path rules, passes through filters, and reaches the backend Pod; a hostname miss returns 404.
Figure 3 traces a single request so the resource model becomes concrete. The request arrives at the Gateway’s 443 listener. The controller evaluates HTTPRoute rules in precedence order: hostname first, then path. On a match the route’s filters run — a header rewrite here — and backendRefs selects the orders Service, which forwards to a healthy Pod. A hostname with no matching route returns a 404 at the Gateway, never reaching a backend. This is the same evaluation an Ingress controller performs, but every step is a typed, validated field instead of an annotation interpreted by convention.
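The trace above corresponds roughly to a route like the one below. The header name and value are hypothetical, chosen only to show where a filter sits in the rule.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: orders
  namespace: shop
spec:
  parentRefs:
    - name: public-gateway
      namespace: infra
  hostnames:
    - "shop.example.com"        # step 1: hostname match
  rules:
    - matches:
        - path:                 # step 2: path match
            type: PathPrefix
            value: /api/orders
      filters:
        # Step 3: filters run on matched requests before forwarding.
        - type: RequestHeaderModifier
          requestHeaderModifier:
            set:
              - name: x-request-source    # hypothetical header
                value: public-gateway
      backendRefs:              # step 4: backend selection
        - name: orders
          port: 8080
```

A misspelled filter `type` or a missing `value` is rejected at apply time by the CRD schema, which is the practical difference from an annotation.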
Option A — Stay on Ingress. Consequence: lowest migration cost and maximum tooling familiarity, but you remain locked into per-controller annotations, cannot cleanly separate platform and app RBAC, and have no standard path to weighted routing or non-HTTP protocols. Acceptable for small, single-team clusters; a growing liability for platforms.
Option B — Adopt the Gateway API for north-south. Consequence: you gain typed expressiveness, role separation, portability across conformant controllers, and first-class traffic splitting. The cost is a new mental model, the need for a Gateway-conformant controller, and running two ingress systems during migration. This is the strategic default for any multi-tenant platform in 2026.
Option C — Add a service mesh for east-west. Consequence: you get mTLS-by-default, retries, and uniform telemetry between services — genuinely valuable for zero-trust and multi-service architectures. The cost is real: extra proxies consume CPU and memory, add a network hop and latency, and introduce significant operational surface. Adopt it because you need its east-west guarantees, not because it is fashionable.
| Dimension | Ingress | Gateway API | Service Mesh |
|---|---|---|---|
| Traffic direction | North-south | North-south (GAMMA adds east-west) | East-west (primarily) |
| Expressiveness | Host/path + annotations | Typed L7: rewrites, splits, filters | Full L7 + traffic policy |
| Role separation | None (one resource) | Strong (GatewayClass/Gateway/Route) | Moderate (mesh + route owners) |
| Portability | Low (vendor annotations) | High (conformance program) | Low to moderate |
| mTLS | Not native | Not native (mesh adds it) | Native, automatic |
| Complexity | Low | Medium | High |
The matrix makes the division of labour obvious. Ingress and Gateway API compete for the same north-south slot, and Gateway API wins everywhere except raw simplicity. Mesh occupies a different slot entirely — it is not a north-south competitor, and treating it as one is a common and expensive category error.
It is worth grounding Option C’s cost in concrete numbers, because “the mesh tax is real” is easy to wave away until it shows up on a capacity plan. Take a mid-sized platform of 300 Pods. In a full sidecar mesh that is 300 extra proxy containers; at a conservative 40 MB and 0.05 vCPU of idle reservation each, that is roughly 12 GB of memory and 15 vCPUs consumed before a single request flows — capacity you must provision and pay for whether or not the mesh is doing useful work. Add the per-hop latency: a call that traverses caller-sidecar and callee-sidecar pays two extra proxy hops, and while each is small, they compound across a deep call graph. None of this is an argument against a mesh; it is an argument for adopting one with eyes open. If the same platform only needs weighted canary routing at the edge, the Gateway API delivers that with zero added proxies, and the capacity comparison is not close.
Migrating from Ingress to Gateway API
The migration is evolutionary, not a flag day. The community ships a tool, ingress2gateway, that reads existing Ingress resources (and several providers’ annotation dialects) and emits equivalent Gateway and HTTPRoute manifests. Treat its output as a first draft: it converts the standard cases and flags annotations it cannot translate, which is exactly where you need a human to decide on the typed equivalent.
The safe pattern is a parallel run, shown in Figure 4. Stand up a new Gateway alongside the existing Ingress controller so both serve traffic. Convert one application’s Ingress with ingress2gateway, review the generated routes, and deploy them attached to the new Gateway on a test hostname or shadow path. Validate response codes and latency, then shift real traffic by moving DNS or the load-balancer VIP. Monitor error rates and tail latency; if anything regresses, roll back by pointing DNS at the old Ingress, which you have not yet removed. Only after a clean soak period do you decommission the old Ingress for that app. Repeat per application, never big-bang, so blast radius stays scoped to one team at a time.
Two practical details make or break this migration. First, the cutover mechanism matters: if both the Ingress and the Gateway sit behind the same external load balancer, the cleanest switch is at DNS or at the LB’s backend target, because that keeps the change at one layer and gives you a fast, atomic rollback. If instead each provisions its own cloud load balancer with its own IP, your “rollback” is a DNS TTL wait, so set a low TTL on the record before you start. Second, do not trust ingress2gateway for the hard cases. It handles host and path rules and the common rewrite annotations well, but bespoke annotations — custom Lua snippets, auth-subrequest configs, mirroring rules — either drop out or come through as a warning. Those are precisely the routes that carry the most risk, so budget engineering time to re-express them as typed filters and test them under load before cutover, not after.
Trade-offs, Gotchas, and What Goes Wrong

Figure 4: A per-application migration loop — convert, review, parallel-deploy, shadow-test, cut over, and decommission only after a healthy soak, with a rollback path at every step.
The biggest trap is implementation conformance gaps. The Gateway API defines a large surface, but a given controller only implements part of it. The project tracks this through a conformance program: features are classed as Core, Extended, or implementation-specific, and resources ship in Standard or Experimental release channels. Two controllers can both claim “Gateway API support” while differing on whether they implement, say, GRPCRoute or specific filter types. Always check the conformance report for the exact features you depend on before committing; “supports Gateway API” is not a yes/no fact.
A related gotcha is controller maturity. Some Gateway implementations are production-hardened with years of mileage; others are newer ports of an existing proxy. The API being GA does not make every controller equally mature. Evaluate the controller, not just the spec.
The third failure is over-adopting a mesh. Teams install Istio for one canary deployment and inherit thousands of proxies, a new control plane to operate, and a latency budget they did not plan for. If your only requirement is north-south canary routing, the Gateway API’s weighted backendRefs deliver it with none of the mesh tax. Reach for a mesh when you genuinely need automatic mTLS and east-west policy across many services — not before.
Finally, watch CRD and version skew. The Gateway API ships as CRDs installed separately from Kubernetes itself, and the controller must support the API versions those CRDs serve (v1 versus v1beta1, for example). A cluster upgrade that pulls a newer controller against stale CRDs can produce routes that validate but never program the data plane, a silent, infuriating failure. The symptom is maddening: kubectl get httproute shows your object, the YAML is valid, but no traffic flows, because the controller is reconciling a CRD version the route was not written against. Pin and reconcile CRD versions through your GitOps tooling rather than applying them by hand, and make a habit of reading the route’s status.parents[].conditions after every apply. An Accepted condition with status False and a reason is the system telling you exactly what went wrong, and it is the single most underused debugging signal in the whole API.
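For orientation, the status of a rejected route might look roughly like this excerpt. It is illustrative only: the controller name and message text are hypothetical, and the exact reasons reported vary by implementation and by failure.

```yaml
# Excerpt of an HTTPRoute as returned by the API server (written by the controller).
status:
  parents:
    - parentRef:
        name: public-gateway
        namespace: infra
      controllerName: example.com/gateway-controller   # hypothetical
      conditions:
        - type: Accepted
          status: "False"
          # NotAllowedByListeners is one of the reasons the spec defines,
          # used when the Gateway's allowedRoutes does not admit this route.
          reason: NotAllowedByListeners
          message: "namespace shop is not permitted by listener https"  # illustrative text
        - type: ResolvedRefs
          status: "True"
          reason: ResolvedRefs
```

An empty `status.parents` list is itself a signal: it usually means no controller has picked the route up at all, which is what version skew tends to look like.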
A subtler anti-pattern is scattering ownership without governance. Role separation is a feature, but if every team can attach any route to the shared production Gateway, you have traded annotation sprawl for route sprawl. Use th
