Kubernetes at the Edge with K3s: A Production-Grade Setup Guide
Edge isn’t just “smaller Kubernetes”—it’s a different problem. In the cloud, you assume reliable networks, elastic compute, and upstream control planes. At the factory floor, telco tower, or remote research station, you face intermittent WAN, power cycling, storage wear, and no “call AWS support.” K3s edge Kubernetes production requires rethinking not just cluster size, but architectural constraints: single-process binaries, embedded datastores, air-gap image sync, and ARM64 compatibility.
This guide walks you through a production-ready K3s edge cluster architecture with hardened installation, HA topology choices, day-2 operations, and the first-principles reasoning behind each decision. By the end, you’ll have concrete commands, Mermaid diagrams showing topology trade-offs, and failure-mode strategies to deploy K3s on industrial-grade hardware—from ruggedized appliances to ARM64 boards.
TL;DR
Deploy a 3-server HA K3s cluster with embedded etcd, secrets encryption, and external load balancer for control-plane failover. Agent nodes at remote sites pull images from a local registry mirror; etcd snapshots ship to external storage via S3. Use system-upgrade-controller for rolling updates. Test on ARM64 early. Monitor WAN flapping and storage wear; abort workload scheduling if the edge site loses upstream connectivity for >5 minutes.
Terminology Grounding
K3s — Single-process Kubernetes distribution (<100 MB binary) bundling apiserver, scheduler, controller-manager, kubelet, and container runtime in one executable. Strips the in-tree cloud integrations (cloud controller manager, EBS/GCE volume plugins) but keeps standard Kubernetes APIs.
kine — SQL-backed key-value shim that lets K3s use PostgreSQL, MySQL, or SQLite as the backend instead of etcd. Allows external database HA without running etcd directly.
Embedded etcd — etcd (Kubernetes’ default datastore) bundled inside K3s server process; no separate installation required. Single-leader consensus; failover takes 5–10 seconds.
Air-gap — Deployment scenario where edge sites have no outbound HTTPS to public registries (docker.io, gcr.io, quay.io). All container images must be pre-synced to a local mirror.
Flannel CNI — Lightweight container network interface overlay (default in K3s). Creates virtual layer-2 network across nodes; simpler than Cilium but lacks east-west encryption and granular policies.
Cilium eBPF — eBPF-based networking with built-in encryption, observability, and strict network policies. Higher memory/CPU overhead; requires modern Linux kernels (5.8+).
Node token — Shared secret K3s uses to authenticate new servers/agents joining a cluster without passing plaintext credentials.
Secrets encryption — Encryption-at-rest for Kubernetes secrets using a static key or KMS. Prevents accidental exposure if etcd snapshots are leaked.
WAN flapping — Repeated loss/recovery of edge site’s upstream network link. Causes API request timeouts, etcd split-brain risk, and workload eviction.
Edge Context and Constraints
Industrial IoT deployments (factories, telco, research) differ from cloud clusters in five ways:
- Power and thermal limits — Edge nodes run fanless or in sealed enclosures. A 4-CPU/4GB RAM industrial PC draws 15–30W; add dense GPU acceleration and you’re limited to passive cooling. Kubernetes overhead (apiserver, kubelet, CNI) must fit in that power budget.
- Intermittent WAN — Edge sites often connect via cellular, satellite, or low-quality fiber. A 50ms RTT surge or 10-minute outage is normal. Kubernetes assumes a stable network; long API latencies cause watches to time out, node leases to miss renewal, and workloads to evict.
- Storage wear — Edge hardware uses eMMC, SSD, or NVMe with limited write endurance (from tens of terabytes written for eMMC up to a few thousand for enterprise NVMe). Kubernetes’ constant writes (etcd, container logs, kubelet state) can exhaust that in 2–3 years. Plan for proactive log rotation and etcd snapshot management.
- No upstream KMS — Cloud deployments use AWS KMS, GCP Cloud KMS, or HashiCorp Vault. Edge sites often have no such luxury; secrets encryption must use local static keys or a lightweight KMS proxy.
- ARM64 prevalence — Edge hardware is often ARM64 (Raspberry Pi, NVIDIA Jetson, Qualcomm Snapdragon). Container images must be multi-arch; single-arch linux/amd64 images fail with “exec format error.”
Given these constraints, K3s is the right tool when:
– You need standard Kubernetes APIs (pod, service, deployment, statefulset).
– Your edge site has 2–4 physical machines (HA cluster) or can tolerate single-point-of-failure.
– You can pre-stage container images or run a local registry mirror.
– Your control plane can absorb 10–50ms API latency and occasional upstream disconnects.
When K3s is the wrong fit: ultra-lightweight single-board setups (<512MB RAM, no HA needed), real-time hard deadlines (<100ms control latency), or deep offline operation (weeks without upstream contact). In those cases, use MicroK8s, Docker Compose, or POSIX process control.
Diagram 1: Edge Cluster at 30,000 Feet
An edge K3s cluster sits at a remote site (factory floor, telco tower) and communicates with a central management plane over WAN. This topology decouples site-local failures from central orchestration.

Key flows:
– Raft consensus between servers — K3s embeds etcd; the three servers elect a leader and replicate state.
– HAProxy/KeepAlived in front — Single entry point for kubelet, kubectl, and apiserver clients; masks leader failover.
– Local registry mirror — Agent nodes pull images locally; no repeated WAN downloads.
– etcd snapshots to external storage — NFS mount or S3 sync ensures recovery capability.
– Edge observability sidecar — Prometheus scrapes apiserver and kubelet metrics (apiserver on 6443, kubelet on 10250); forwards to central stack over WAN.
– GitOps over tunnel — Central ArgoCD/Flux repo pushes workloads down; edge cluster pulls and applies manifests.
This architecture tolerates site-local failures (agent node crash, container restart) without central intervention, yet maintains auditability and workload synchronization via GitOps.
Diagram 2: K3s Internal Components
K3s ships as a single binary, but internally runs multiple processes. Understanding this is key to tuning resource allocation and debugging failures.

Why one binary? Upstream Kubernetes runs ~5 separate binaries (apiserver, scheduler, controller-manager, kubelet, kube-proxy). K3s bundles these into a single executable because:
– Easier installation (one curl | sh).
– Shared memory between components (no IPC overhead).
– Simplified hardening (single entrypoint, single systemd unit).
– Predictable resource footprint.
kine abstraction: Instead of embedding etcd directly, K3s wraps it with kine—a thin SQL-to-etcd compatibility layer. This allows:
– Using PostgreSQL or MySQL as backend (via --datastore-endpoint postgres://...).
– Simpler HA (external DB replication instead of etcd quorum).
– Easier disaster recovery (standard SQL backups).
Embedded etcd: If you don’t set --datastore-endpoint, K3s runs etcd in-process. It’s a single-leader consensus system: one node is leader; others replicate. Failover to a new leader takes 5–10 seconds end to end (a sub-second etcd election, plus watch re-establishment and client reconnects).
Diagram 3: HA Topology Choices (Embedded vs External)
K3s supports three HA architectures. Choose based on your disaster recovery (DR) and failover tolerance.

A: Embedded etcd (3-node HA, cluster-init)
– Pros: No external dependencies; self-healing raft consensus; zero extra DB management.
– Cons: Failover takes 5–10 seconds end to end (election plus client reconnects); 2 of the 3 nodes must stay up to keep quorum; etcd compaction is automatic but can spike CPU/disk.
– Use when: You have 3+ stable edge machines and can tolerate short outages. Data center or co-lo scenario.
– Failover test: Kill the leader node; watch election in logs (sudo journalctl -u k3s -f | grep "elected leader").
B: External PostgreSQL backend (kine + SQL)
– Pros: Simpler HA (PostgreSQL replication is well-known); faster failover (seconds, not 5–10s); can use managed databases (AWS RDS, Azure PostgreSQL).
– Cons: Requires running a separate database; kine adds latency (~1–5ms per API call); SQL backups must be coordinated.
– Use when: You have existing PostgreSQL infrastructure or need sub-5s failover. Multi-region deployments.
– Flag: K3S_DATASTORE_ENDPOINT="postgres://user:pass@db.factory.local/k3s"
C: Single server + snapshots (minimal HA)
– Pros: Simplest deployment; embedded etcd is lightweight.
– Cons: Single point of failure; recovery requires restoring from snapshot (manual process, downtime).
– Use when: Development, lab, or sites where <1 hour of downtime is acceptable.
– Recovery: sudo k3s server --cluster-reset-restore-path=/path/to/snapshot.db
Recommendation for production edge: Architecture A (embedded etcd + HAProxy VIP). Embedded etcd gives you zero external DB overhead, and HAProxy is a 5-line config that masks failover. Cost is three nodes and 5–10 second failover; benefit is operational simplicity and deterministic HA.
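The quorum arithmetic behind the three-node recommendation is worth internalizing: an n-member etcd cluster needs floor(n/2)+1 members up, so even cluster sizes buy nothing. A quick sketch:

```shell
# Quorum arithmetic: n members tolerate floor((n-1)/2) failures.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerates() { echo $(( ($1 - 1) / 2 )); }
for n in 1 2 3 4 5; do
  echo "$n members: quorum=$(quorum $n), tolerates $(tolerates $n) failure(s)"
done
```

Note that 4 members tolerate the same single failure as 3, while adding write latency; this is why 3 (or 5) is the standard choice.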
Installation: 3-Node HA Cluster with Embedded etcd
Prerequisites
- 3× machines: 4 CPU, 4GB RAM, Ubuntu 22.04 LTS or Rocky Linux 9. (Tested: Lenovo ThinkEdge, ASUS Edge appliances, Raspberry Pi 4 8GB.)
- Static IPs: 10.0.1.100–102 for servers; 10.0.2.1+ for agents.
- Time sync: NTP enabled across all nodes (timedatectl set-ntp true).
- Outbound HTTPS: Access to k3s.io, or pre-download binary for air-gap.
- Root or sudoers: Required for systemd, iptables, mount operations.
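A small preflight script run on each machine catches the common misses before installation. This is a sketch; the architecture whitelist and the 1 GiB memory floor are assumptions, not formal K3s requirements:

```shell
# preflight.sh -- sanity checks before installing K3s (thresholds are assumptions)
arch_ok() { case "$1" in x86_64|aarch64) return 0 ;; *) return 1 ;; esac; }
mem_ok()  { [ "$1" -ge 1048576 ]; }   # >= 1 GiB, expressed in kB

arch=$(uname -m)
mem_kb=$(awk '/MemTotal/{print $2}' /proc/meminfo 2>/dev/null || echo 0)

arch_ok "$arch" && echo "OK   arch: $arch" || echo "WARN arch: $arch (untested)"
mem_ok "${mem_kb:-0}" && echo "OK   mem: ${mem_kb}kB" || echo "FAIL mem: ${mem_kb}kB"
```

Extend it with NTP, port, and DNS checks per site; the point is to fail loudly before K3s fails quietly.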
Step 1: Bootstrap Server 1 (Cluster Init)
Server 1 becomes the initial cluster leader and initializes etcd quorum.
#!/bin/bash
# On 10.0.1.100 (server1)
# Disable firewall for inter-node communication
# (Production: add iptables rules for 6443, 10250, 2379-2380)
sudo ufw disable
# Create k3s config directory
sudo mkdir -p /etc/rancher/k3s
# Generate TLS SAN list (all server IPs + load balancer VIP)
cat <<'EOF' | sudo tee /etc/rancher/k3s/config.yaml
---
write-kubeconfig-mode: "0600"
tls-san:
- "10.0.1.100"
- "10.0.1.101"
- "10.0.1.102"
- "10.0.1.50" # HAProxy VIP
- "k3s-lb.factory.local"
- "*.factory.local"
disable:
- traefik # Install ingress separately
- local-storage # Use external storage class
secrets-encryption: true
etcd-expose-metrics: true # For prometheus scrape
kube-apiserver-arg:
- "enable-admission-plugins=NodeRestriction"
- "audit-log-maxage=30"
- "audit-log-maxbackup=10"
- "audit-log-maxsize=100"
kube-controller-manager-arg:
- "bind-address=127.0.0.1"
kubelet-arg:
- "system-reserved=cpu=200m,memory=256Mi,ephemeral-storage=1Gi"
- "kube-reserved=cpu=100m,memory=128Mi,ephemeral-storage=1Gi"
- "max-pods=110"
- "eviction-hard=memory.available<10%,nodefs.available<10%"
EOF
# Download installer
curl -sfL https://get.k3s.io -o /tmp/install-k3s.sh
chmod +x /tmp/install-k3s.sh
# Install K3s in cluster-init mode (first server, starts etcd quorum)
# K3S_CLUSTER_INIT tells K3s "this is the first node, initialize etcd"
export K3S_CLUSTER_INIT=true
export K3S_TOKEN=$(openssl rand -base64 32)
export INSTALL_K3S_VERSION=v1.30.0+k3s1  # K3s release tags carry a +k3sN suffix
sudo /tmp/install-k3s.sh server \
--config /etc/rancher/k3s/config.yaml \
--cluster-init
# Wait for apiserver to be ready
sudo k3s kubectl get nodes
# Retrieve and save token securely (pass to servers 2 & 3 via SSH)
echo "=== Node Token (save securely) ==="
sudo cat /var/lib/rancher/k3s/server/node-token
What happens:
– --cluster-init tells K3s to create a new etcd cluster (not join existing).
– systemd unit k3s is started automatically.
– apiserver listens on 0.0.0.0:6443.
– etcd listens on 127.0.0.1:2379 (internal only).
– kubelet starts, registers node with apiserver.
Verify:
sudo k3s kubectl get nodes
sudo k3s kubectl get pods -A
sudo systemctl status k3s
Step 2: Join Servers 2 & 3 to Cluster
#!/bin/bash
# On 10.0.1.101 (server2) and 10.0.1.102 (server3)
sudo mkdir -p /etc/rancher/k3s
# Use same config as server1 (TLS SAN, flags)
cat <<'EOF' | sudo tee /etc/rancher/k3s/config.yaml
---
write-kubeconfig-mode: "0600"
tls-san:
- "10.0.1.100"
- "10.0.1.101"
- "10.0.1.102"
- "10.0.1.50"
- "k3s-lb.factory.local"
disable:
- traefik
- local-storage
secrets-encryption: true
etcd-expose-metrics: true
kube-apiserver-arg:
- "enable-admission-plugins=NodeRestriction"
- "audit-log-maxage=30"
kube-controller-manager-arg:
- "bind-address=127.0.0.1"
kubelet-arg:
- "system-reserved=cpu=200m,memory=256Mi,ephemeral-storage=1Gi"
- "kube-reserved=cpu=100m,memory=128Mi,ephemeral-storage=1Gi"
- "max-pods=110"
- "eviction-hard=memory.available<10%,nodefs.available<10%"
EOF
curl -sfL https://get.k3s.io -o /tmp/install-k3s.sh
chmod +x /tmp/install-k3s.sh
# Install as server mode, joining existing cluster
export K3S_URL="https://10.0.1.100:6443"
export K3S_TOKEN="<paste-token-from-server1>"
export INSTALL_K3S_VERSION=v1.30.0+k3s1  # same +k3sN-suffixed version as server1
sudo /tmp/install-k3s.sh server \
--config /etc/rancher/k3s/config.yaml
# Wait for node registration
sleep 10
sudo k3s kubectl get nodes
What happens:
– K3S_URL points to an existing server; K3s joins that cluster.
– No --cluster-init flag; server 2 & 3 join the existing etcd quorum.
– Raft election happens automatically; all three nodes replicate state.
Verify quorum:
# Embedded etcd runs inside the k3s process -- there is no etcd pod to exec into.
# Query it directly with etcdctl (install etcdctl separately; it is not bundled):
sudo etcdctl member list -w table \
  --endpoints https://127.0.0.1:2379 \
  --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key
# Output should show 3 members, exactly one with IS LEADER=true.
# Quick sanity check without etcdctl: all three nodes should report
# Ready with roles control-plane,etcd,master:
sudo k3s kubectl get nodes
Step 3: Deploy HAProxy/KeepAlived for Control Plane LB
On a fourth machine — ideally a pair, since KeepAlived's active-passive VIP failover needs a BACKUP peer — deploy HAProxy with KeepAlived to create a virtual IP (VIP) that masks failover.
# On each LB machine (VIP 10.0.1.50 is held by KeepAlived, not configured statically)
sudo apt-get update
sudo apt-get install -y haproxy keepalived
# HAProxy config: distribute traffic to all three servers
cat <<'EOF' | sudo tee /etc/haproxy/haproxy.cfg
global
log /dev/log local0
maxconn 4096
defaults
log global
mode tcp
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend k3s_api
bind 0.0.0.0:6443
default_backend k3s_servers
backend k3s_servers
balance roundrobin
server server1 10.0.1.100:6443 check
server server2 10.0.1.101:6443 check
server server3 10.0.1.102:6443 check
EOF
# KeepAlived config: VIP failover (active-passive)
cat <<'EOF' | sudo tee /etc/keepalived/keepalived.conf
vrrp_script check_haproxy {
script "/usr/bin/pgrep haproxy"
interval 2
weight -2
}
vrrp_instance VI_1 {
state MASTER      # on the BACKUP peer: state BACKUP, priority 90
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass Fact0ry!
}
virtual_ipaddress {
10.0.1.50/24
}
track_script {
check_haproxy
}
}
EOF
sudo systemctl restart haproxy keepalived
sudo systemctl enable haproxy keepalived
# Verify VIP is up
ip addr show | grep 10.0.1.50
Now all agents and client tools point to 10.0.1.50:6443 instead of individual server IPs. If server1 fails, KeepAlived demotes it and VIP moves to server2 or server3 automatically.
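What HAProxy's `check` keyword does is, at its core, a raw TCP connect probe. The one-function sketch below (bash's /dev/tcp; HAProxy additionally applies rise/fall counters before flipping state) is handy for debugging a backend from the LB machine:

```shell
# probe HOST PORT -> exit 0 if a TCP connect succeeds (bash /dev/tcp sketch)
probe() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# On the LB, probe each backend the same way the health check does:
probe 127.0.0.1 6443 && echo "api UP" || echo "api DOWN"
```

If `probe` disagrees with HAProxy's view of a backend, suspect firewall rules between the LB and that server rather than K3s itself.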
Step 4: Secrets Encryption at Rest
Enable encryption for all secrets in etcd. The secrets-encryption: true line already set in config.yaml (Step 1) makes K3s generate and manage an AES-CBC EncryptionConfiguration itself; verify it with the built-in tooling:
# On server1
sudo k3s secrets-encrypt status
To supply your own key instead, remove secrets-encryption from config.yaml and point the apiserver at a custom config. Note the unquoted EOF, so the $(openssl ...) substitution actually expands:
cat <<EOF | sudo tee /var/lib/rancher/k3s/server/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: "$(openssl rand -base64 32)"
      - identity: {}
EOF
sudo chmod 600 /var/lib/rancher/k3s/server/encryption-config.yaml
# Reference it from config.yaml, then restart:
#   kube-apiserver-arg:
#     - "encryption-provider-config=/var/lib/rancher/k3s/server/encryption-config.yaml"
sudo systemctl restart k3s
# Test: create a secret, then read it straight out of etcd -- the stored value
# must begin with k8s:enc:aescbc:v1:, not readable JSON
sudo k3s kubectl create secret generic test --from-literal=password=secret123 -n default
sudo etcdctl get /registry/secrets/default/test \
  --endpoints https://127.0.0.1:2379 \
  --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key | head -c 200
Why? Secrets in etcd are base64-encoded by default (not encrypted). If someone gets filesystem access to etcd snapshots, they can read all secrets. Encryption adds a static key barrier.
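To make the "base64 is not encryption" point concrete: anyone holding an etcd snapshot reverses the encoding in one command. That is the gap encryption-at-rest closes:

```shell
# base64 round-trip: encoding, not encryption
stored=$(printf 'secret123' | base64)
recovered=$(printf '%s' "$stored" | base64 -d)
echo "stored:    $stored"
echo "recovered: $recovered"
```

With aescbc enabled, the same etcd value starts with `k8s:enc:aescbc:v1:` and is useless without the key.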
Step 5: Join Agent Nodes
Agent nodes run kubelet and the container runtime but not apiserver/etcd. They pull workloads from control plane.
#!/bin/bash
# On each agent node (10.0.2.x, 10.0.3.x)
sudo mkdir -p /etc/rancher/k3s
curl -sfL https://get.k3s.io -o /tmp/install-k3s.sh
chmod +x /tmp/install-k3s.sh
# Agent mode: connect to control plane VIP
export K3S_URL="https://10.0.1.50:6443"
export K3S_TOKEN="<token-from-server1>"
sudo /tmp/install-k3s.sh agent
# (K3S_URL and K3S_TOKEN come from the environment; agents take no
# server-only flags like --write-kubeconfig-mode)
# Label the agent for workload placement -- run this FROM A SERVER node,
# since agents carry no admin kubeconfig:
sudo k3s kubectl label node <agent-hostname> \
site=floor-a \
edge-role=data-collection \
device-type=industrial-pc
Verify all nodes:
sudo k3s kubectl get nodes -o wide
Diagram 4: HA Failover Test (Leader Election)
When the leader server crashes, etcd triggers an election and chooses a new leader. This diagram shows the transition.

Timeline:
– T=0 to T=5s: Server 1 is etcd leader, sending heartbeats every 100ms (etcd default).
– T=5s: Server 1 crashes or its network fails. Servers 2 & 3 stop receiving heartbeats.
– T≈6s: The election timeout fires (1s by default in etcd). Servers 2 & 3 vote; server 2 wins and becomes leader.
– T=6s to T=10s: The apiservers on servers 2 & 3 reconnect to the new leader and writes resume; server 3 catches up on replication. (The apiserver itself keeps running; only etcd leadership moves.)
– T≈11s: HAProxy's TCP check fails repeatedly against server 1 and removes it from the pool. New requests route to servers 2 & 3.
Client impact: Existing watches time out (~5–15s). New requests succeed once HAProxy converges. This is why production clusters run load balancers—they mask the failover and make it transparent.
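The failover budget can be sanity-checked with back-of-envelope arithmetic. etcd's defaults are real; the HAProxy check settings (inter 2s, fall 3) are assumptions you should match against your haproxy.cfg:

```shell
# Failover budget: etcd re-election + load-balancer eviction (figures are assumptions)
heartbeat_ms=100; election_ms=1000     # etcd defaults
check_inter_s=2;  check_fall=3         # assumed HAProxy check cadence
etcd_ms=$(( heartbeat_ms + election_ms ))   # worst-case re-election window
lb_s=$(( check_inter_s * check_fall ))      # time until LB marks backend down
echo "etcd re-election: ~${etcd_ms}ms; LB eviction: ~${lb_s}s"
```

If your measured failover is far above these numbers, look at client watch re-establishment and DNS caching rather than etcd itself.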
Diagram 5: Air-Gap Image Sync Pipeline
Edge sites often have no outbound HTTPS to public registries. All images must be synced to a local mirror ahead of time.

Typical workflow:
# On a machine with internet (e.g., CI runner)
# Sync public images to a local directory
skopeo sync \
  --src docker \
  --dest dir \
  docker.io/library/nginx \
  /tmp/images/
# Archive for transfer
tar czf images.tar.gz -C /tmp images/
# Transfer to edge site (scp, S3, USB)
scp images.tar.gz edge-site:/tmp/
# On edge site: unpack and push into the local registry (zot, Harbor, registry:2)
tar xzf /tmp/images.tar.gz -C /tmp
skopeo sync \
  --src dir \
  --dest docker \
  --dest-tls-verify=false \
  /tmp/images/ \
  mirror.factory.local:5000/library
# (or run a pull-through mirror that auto-fetches on demand while WAN is up)
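The image list that drives the sync is itself worth versioning: sorted, deduped, reviewed in git. A sketch (the file path and image names are placeholders):

```shell
# Keep the site's sync list canonical: sorted and deduplicated
list=$(mktemp)
cat > "$list" <<'EOF'
docker.io/library/nginx:1.27
docker.io/library/redis:7
docker.io/library/nginx:1.27
EOF
sort -u "$list" -o "$list"
count=$(wc -l < "$list" | tr -d ' ')
echo "unique images to sync: $count"
```

A CI job can then loop over this file with skopeo sync, so edge sites never receive an image that wasn't reviewed.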
For K3s, declare the mirror in /etc/rancher/k3s/registries.yaml: the embedded containerd reads it at startup and rewrites docker.io pulls to the local registry:
# /etc/rancher/k3s/registries.yaml (on every node)
mirrors:
  docker.io:
    endpoint:
      - "https://mirror.factory.local:5000"
configs:
  "mirror.factory.local:5000":
    auth:
      username: mirror-user
      password: <mirror-password>
In workload manifests, keep imagePullPolicy: IfNotPresent so images already imported on the node are never re-pulled. (Per-namespace imagePullSecrets work too, but registries.yaml applies cluster-wide without touching manifests.)
Diagram 6: Upgrade Path (Rolling Update with system-upgrade-controller)
K3s upgrades are coordinated by system-upgrade-controller, which cordons nodes, drains pods, and replaces the binary.

Install system-upgrade-controller:
# On control plane. The release ships two manifests: the CRDs, and a bundle
# containing the namespace, ServiceAccount, RBAC, and Deployment the
# controller needs (the Plan below references that ServiceAccount).
sudo k3s kubectl apply -f \
  https://github.com/rancher/system-upgrade-controller/releases/download/v0.14.0/crd.yaml
sudo k3s kubectl apply -f \
  https://github.com/rancher/system-upgrade-controller/releases/download/v0.14.0/system-upgrade-controller.yaml
# Serial upgrades (one node at a time) are enforced per-Plan via spec.concurrency below.
Create upgrade plan:
sudo k3s kubectl apply -f - <<'EOF'
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-server-upgrade
  namespace: system-upgrade
spec:
  # Safety: only one node at a time
  concurrency: 1
  # Target K3s version (release tags carry a +k3sN suffix)
  version: v1.31.0+k3s1
  # Targets: only nodes with label k3s-upgrade=true
  nodeSelector:
    matchExpressions:
      - key: k3s-upgrade
        operator: In
        values:
          - "true"
  serviceAccountName: system-upgrade
  # Cordon + drain before upgrade
  cordon: true
  drain:
    force: true
    skipWaitForDeleteTimeout: 60
  # The upgrade job runs rancher/k3s-upgrade (not the rancher/k3s runtime
  # image) to swap the binary on the host and restart the service
  upgrade:
    image: rancher/k3s-upgrade
EOF
# Trigger upgrade on a single node first (test)
sudo k3s kubectl label node server1 k3s-upgrade=true
# Watch upgrade progress
sudo k3s kubectl get nodes -w
sudo k3s kubectl logs -n system-upgrade -f deploy/system-upgrade-controller
# Monitor etcd membership during upgrade (embedded etcd has no pod;
# use etcdctl with the K3s-managed certs, as in Step 2)
while true; do
  sudo etcdctl member list -w table \
    --endpoints https://127.0.0.1:2379 \
    --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
    --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
    --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key
  sleep 5
done
Failover during upgrade: If a server crashes mid-upgrade, the remaining two quorum members keep control plane alive. Wait for server to rejoin, or uncordon it and let the upgrade restart.
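Upgrade automation deserves a guard rail: refuse downgrades and no-ops before a Plan ever touches a node. A sketch using `sort -V`, which orders the `v1.30.0+k3s1` tag format well enough for this purpose:

```shell
# is_newer CURRENT TARGET -> exit 0 only if TARGET is strictly newer
is_newer() {
  [ "$1" != "$2" ] && \
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -1)" = "$2" ]
}

is_newer v1.30.0+k3s1 v1.31.0+k3s1 && echo "proceed" || echo "refuse"
```

Wire this into the pipeline that templates the Plan's `version` field, so a fat-fingered tag fails CI instead of cordoning production nodes.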
ARM64 Gotchas
K3s runs on ARM64 (Raspberry Pi, NVIDIA Jetson, Qualcomm), but there are several gotchas:
1. Container images must be multi-arch
# Bad: an amd64-only image deploys fine but the pod dies on ARM64 nodes
# with "exec format error" as soon as the binary executes.
# Check which architectures a tag actually ships before rollout
# (myregistry/legacy-app is a hypothetical example):
docker manifest inspect myregistry/legacy-app:1.0 | grep architecture
# Good: publish/pull manifest lists so the runtime picks the node's arch,
# or request the platform explicitly:
docker pull --platform linux/arm64 ubuntu:22.04
2. Kernel features
– K3s on ARM64 requires kernel 5.4+ for network policies, eBPF.
– Cilium’s eBPF datapath can cost noticeably more CPU on low-power ARM64 boards than Flannel’s VXLAN path.
– Flannel VXLAN is the safer default on constrained ARM boxes.
3. Device plugin compatibility
– NVIDIA GPU device plugin works on Jetson but requires CUDA runtime.
– Coral Edge TPU accelerators need a custom device plugin (not bundled).
4. systemd and cgroups v2
– Ubuntu 22.04 on ARM64 defaults to cgroups v2, which some older tools don’t support.
– K3s handles it, but watch for kubelet warnings.
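The gotchas above can be surveyed in a few lines on each candidate box before any K3s bits land; the cgroup check reads the standard v2 marker file:

```shell
# Host survey: architecture, kernel, and cgroup version
echo "arch:   $(uname -m)"
echo "kernel: $(uname -r)"
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
  cg=v2   # unified hierarchy mounted
else
  cg=v1
fi
echo "cgroups: $cg"
```

Run it in your provisioning pipeline and record the output per node; "works on my Pi" bugs usually trace back to one of these three lines.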
Test ARM64 early:
# On Raspberry Pi or ARM64 machine
uname -m # Should print aarch64
lsb_release -a # Ubuntu 22.04 or later
curl -sfL https://get.k3s.io | sh -s - server --cluster-init
k3s kubectl get nodes
Air-Gap Install (No Internet)
For sites with no outbound HTTPS:
# On a machine with internet
# 1. Download K3s binary, the installer script, and the airgap images
mkdir -p /tmp/k3s-bundle
cd /tmp/k3s-bundle
# K3s release tags carry a +k3s1 suffix; '+' is %2B in the URL
wget https://github.com/k3s-io/k3s/releases/download/v1.30.0%2Bk3s1/k3s
chmod +x k3s
# Airgap image tarball (everything a full install needs)
wget https://github.com/k3s-io/k3s/releases/download/v1.30.0%2Bk3s1/k3s-airgap-images-amd64.tar.gz
# or for ARM64:
# wget https://github.com/k3s-io/k3s/releases/download/v1.30.0%2Bk3s1/k3s-airgap-images-arm64.tar.gz
# The installer must also travel in the bundle -- the edge site can't curl it
curl -sfL https://get.k3s.io -o install.sh
# Bundle
tar czf k3s-bundle-v1.30.0.tar.gz k3s install.sh k3s-airgap-images-*.tar.gz
# Transfer to edge site (scp, rsync, USB)
scp k3s-bundle-v1.30.0.tar.gz edge-site:/tmp/
# 2. On edge site (air-gap)
cd /tmp
tar xzf k3s-bundle-v1.30.0.tar.gz
# Copy binary to PATH
sudo cp k3s /usr/local/bin/
sudo chmod +x /usr/local/bin/k3s
# Stage images BEFORE first start: K3s loads any tarball found in this
# directory into containerd at boot -- copy the archive as-is, do NOT extract it
sudo mkdir -p /var/lib/rancher/k3s/agent/images
sudo cp k3s-airgap-images-*.tar.gz /var/lib/rancher/k3s/agent/images/
# Install from the local script; skip-download uses the binary already in PATH
sudo INSTALL_K3S_SKIP_DOWNLOAD=true sh install.sh server --cluster-init
Edge-Specific Failure Modes
WAN flapping — Repeated loss/recovery of upstream link causes:
– API request timeouts (watches expire, connections reset).
– Node lease misses (kubelet can’t renew; node gets marked NotReady).
– Workload eviction (pods are deleted after 5m of NotReady).
Mitigation:
– Lengthen the kubelet’s node-status-update-frequency (default: 10s) and the controller-manager’s node-monitor-grace-period so short gaps don’t mark nodes NotReady.
– Use pod disruption budgets (PDB) to prevent cascading evictions.
– Raise eviction tolerances for critical workloads (node.kubernetes.io/unreachable tolerationSeconds).
# In /etc/rancher/k3s/config.yaml on servers -- widen both knobs together
kubelet-arg:
  - "node-status-update-frequency=20s"
kube-controller-manager-arg:
  - "node-monitor-grace-period=60s"
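The TL;DR's "abort scheduling after >5 minutes offline" rule is, at its core, a debounce. The sketch below stubs the link probe with canned samples (one per minute; the threshold is the TL;DR's assumption) to show the logic a site watchdog would run before cordoning the node:

```shell
# Debounce: act only after the uplink has been down for >= threshold seconds
threshold=300; down_s=0; action=""
for state in up down down down down down; do   # stubbed probe results
  if [ "$state" = down ]; then
    down_s=$((down_s + 60))                    # one sample per minute
  else
    down_s=0                                   # any success resets the clock
  fi
  if [ "$down_s" -ge "$threshold" ]; then action="cordon"; fi
done
echo "offline ${down_s}s -> action: ${action:-none}"
```

In production the loop body would ping the upstream gateway and, on trigger, run `kubectl cordon` against the local node; the reset-on-success line is what keeps WAN flapping from causing premature cordons.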
Power cycling — Site loses power, UPS drains, nodes restart:
– etcd needs fsync() integrity; a power cut mid-write can corrupt state.
– Mitigation: always-restart plus a generous stop timeout so etcd can flush on shutdown, and a UPS-triggered clean OS shutdown before the battery dies.
# In /etc/systemd/system/k3s.service.d/override.conf
[Unit]
After=network-online.target
Wants=network-online.target
[Service]
Restart=always
TimeoutStopSec=120
Storage wear — eMMC/SSD have limited write cycles, and etcd writes constantly:
– Monitor media health: smartctl for SATA/NVMe; raw eMMC usually lacks S.M.A.R.T and needs mmc-utils instead.
– Tune embedded etcd’s auto-compaction via etcd-arg.
– Mount /var/lib/rancher on separate high-endurance storage if possible.
# Check storage health (NVMe/SATA)
sudo smartctl -a /dev/nvme0 | grep -i -E "wear|percentage used"
# Tune compaction in /etc/rancher/k3s/config.yaml (passed through to embedded etcd)
etcd-arg:
  - "auto-compaction-retention=1h"
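A wear projection turns "storage wear" from a vague worry into a replacement schedule. All figures below are assumptions for illustration; measure the real daily write volume with iostat on your own nodes:

```shell
# Media life estimate: rated endurance (TBW) vs measured daily writes
endurance_tbw=30          # assumed industrial eMMC rating
writes_mb_per_day=30000   # assumed: etcd WAL + container logs + kubelet state
days=$(( endurance_tbw * 1000000 / writes_mb_per_day ))
echo "projected media life: ${days} days (~$(( days / 365 )) years)"
```

With these assumed figures the projection lands in the 2–3 year window cited earlier, which is why log rotation and compaction tuning pay for themselves.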
Real-World Implications
Scenario 1: Factory floor with 100 edge sites
– Each site runs 3-node K3s HA cluster (300 machines).
– Central GitOps (ArgoCD) pushes workloads to all clusters.
– etcd snapshots are backed up to S3 nightly.
– Failover time: <30 seconds (HAProxy detects down server, routes to healthy ones).
– RTO (recovery time objective): <1 hour (restore from etcd snapshot, rejoin cluster).
– Data loss window: up to 24 hours with nightly snapshots (halve it by snapshotting every 12 hours).
Scenario 2: Telco edge with intermittent satellite uplink
– Sites connect via satellite (500ms latency, 10 Mbps). WAN flapping every 2–3 hours.
– Use external PostgreSQL backend (kine) for faster failover.
– Edge sites are isolated; workload orchestration is local.
– Central monitoring pulls metrics over low-bandwidth satellite link.
– API latency tolerance: lengthen node-status-update-frequency and node-monitor-grace-period (see Edge-Specific Failure Modes) so 500ms RTTs and brief flaps don’t mark nodes NotReady.
Scenario 3: Offline research station (weeks without WAN)
– Single K3s server with embedded etcd.
– etcd snapshots are manually copied out via USB on weekly supply runs.
– No GitOps pull; workloads are pre-staged and run locally.
– Metrics are logged locally; exported post-hoc.
– Failover: manual (restore from snapshot, reboot).
First-Principles: Why K3s Design Choices Matter
Why a single binary? Kubernetes traditionally runs 5+ separate binaries (apiserver, scheduler, controller-manager, kubelet, kube-proxy). K3s bundles them because:
– Single entry point = single failure mode (one systemd unit instead of 5).
– Shared memory between components (no IPC latency).
– Easier hardening (one attack surface for TLS, RBAC, secrets).
– Simpler install (curl | sh instead of multi-step orchestration).
Why kine? etcd is a distributed consensus system, which is overkill for single-datastore scenarios. kine abstracts the storage backend so you can:
– Use PostgreSQL/MySQL replication (well-understood, operational experience) instead of etcd’s custom raft.
– Simplify HA (database replication is easier than etcd quorum management).
– Leverage managed databases (AWS RDS, Azure Database).
– Trade-off: kine adds ~1–5ms latency per API call vs etcd’s <1ms.
Why Flannel by default? Flannel is 15–20MB; Cilium is 50–100MB. At edge sites with <4GB RAM, that 80MB overhead is real. Flannel VXLAN is good enough for most workloads; Cilium shines when you need zero-trust network policies or eBPF-based observability (requires modern kernels and more CPU).
Why embedded etcd snapshot backups? etcd snapshots are point-in-time copies of cluster state. Restoring from a snapshot requires:
– Stopping the cluster (downtime).
– Restoring the snapshot file.
– Restarting etcd.
Total: ~5–10 minutes. For DR planning, assume you’ll lose everything since the last snapshot.
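Since the worst-case data loss equals the snapshot interval, the schedule belongs in config, not cron. K3s exposes snapshot scheduling and S3 shipping as server flags; the endpoint and bucket names below are assumptions for a site-local MinIO, and credentials are omitted:

```yaml
# /etc/rancher/k3s/config.yaml -- scheduled etcd snapshots shipped off-node
etcd-snapshot-schedule-cron: "0 */12 * * *"   # every 12h halves the worst-case loss window
etcd-snapshot-retention: 14                    # keep two weeks of snapshots
etcd-s3: true
etcd-s3-endpoint: "s3.factory.local:9000"      # assumed site-local MinIO
etcd-s3-bucket: "k3s-snapshots"
```

Pair this with a periodic restore drill on a lab cluster; a snapshot you have never restored is a hope, not a backup.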
Further Reading
- K3s Official Docs — installation, configuration, architecture.
- Rancher K3s GitHub — source, releases, security advisories.
- etcd Disaster Recovery — backup, restore, corruption recovery.
- Kubernetes Edge Computing Patterns — node management at scale.
- NSA/CISA Kubernetes Hardening Guidance — security baselines for K3s.
- KeepAlived and HAProxy for HA — VIP failover implementation.
- Related: ArgoCD vs Flux: GitOps for edge cluster synchronization.
- Related: Unified Namespace Architecture: orchestrating data flows across edge clusters.
Deploy K3s edge with confidence. Start with a 3-server HA cluster behind an HAProxy VIP, enable secrets encryption, set up etcd snapshot backups, and mirror images for air-gap sites. Test failover on day one (kill a server, watch election). Monitor WAN link quality and storage wear. K3s scales from a fanless single-board computer to thousands of edge nodes—the architecture you build now determines how far you can push it.
