# ARM64 Raspberry Pi Worker Node Strategy
- Status: accepted
- Date: 2026-02-05
- Deciders: Billy
- Technical Story: Integrate Raspberry Pi nodes into the Kubernetes cluster
## Context and Problem Statement
The homelab cluster includes 5 Raspberry Pi 4/5 nodes (ARM64 architecture) alongside x86_64 servers. These low-power nodes provide:
- Additional compute capacity for lightweight workloads
- Geographic distribution within the home network
- Learning platform for multi-architecture Kubernetes
However, ARM64 nodes have constraints:
- No GPU acceleration
- Lower CPU/memory than x86_64 servers
- Some container images lack ARM64 support
- Limited local storage
How do we effectively integrate ARM64 nodes while avoiding scheduling failures?
## Decision Drivers
- Maximize utilization of ARM64 compute
- Prevent ARM-incompatible workloads from scheduling
- Maintain cluster stability
- Support multi-arch container images
- Minimize operational overhead
## Considered Options

1. Node labels + affinity for workload placement
2. Separate ARM64-only namespace
3. Taints to exclude from general scheduling
4. ARM64 nodes for specific workload types only
## Decision Outcome

Chosen option: a hybrid of Option 1 and Option 4. Node labels with affinity rules steer placement, and ARM64 nodes are designated for specific workload categories.
ARM64 nodes handle:
- Lightweight control plane components (where multi-arch images exist)
- Velero node-agent (backup DaemonSet)
- Node-level monitoring (Prometheus node-exporter)
- Future: Edge/IoT workloads
### Positive Consequences
- Clear workload segmentation
- No scheduling failures from arch mismatch
- Efficient use of low-power nodes
- Room for future ARM-specific workloads
- Cost-effective cluster expansion
### Negative Consequences
- Some nodes may be underutilized
- Must maintain multi-arch image awareness
- Additional scheduling complexity
## Cluster Composition
| Node | Architecture | Role | Instance Type |
|---|---|---|---|
| bruenor | amd64 | control-plane | - |
| catti | amd64 | control-plane | - |
| storm | amd64 | control-plane | - |
| khelben | amd64 | GPU worker (Strix Halo) | - |
| elminster | amd64 | GPU worker (NVIDIA) | - |
| drizzt | amd64 | GPU worker (RDNA2) | - |
| danilo | amd64 | GPU worker (Intel Arc) | - |
| regis | amd64 | worker | - |
| wulfgar | amd64 | worker | - |
| durnan | arm64 | worker | raspberry-pi |
| elaith | arm64 | worker | raspberry-pi |
| jarlaxle | arm64 | worker | raspberry-pi |
| mirt | arm64 | worker | raspberry-pi |
| volo | arm64 | worker | raspberry-pi |
## Node Labels

```yaml
# Applied via Talos machine config or kubectl
labels:
  kubernetes.io/arch: arm64
  kubernetes.io/os: linux
  node.kubernetes.io/instance-type: raspberry-pi
  kubernetes.io/storage: none  # custom label: no Longhorn on Pis
```
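With Talos, the custom labels can be declared directly in each Pi's machine config so they survive node rebuilds; a sketch (the `kubernetes.io/arch` and `kubernetes.io/os` labels are set automatically by the kubelet and are omitted):

```yaml
# Talos machine config fragment for a Raspberry Pi node
machine:
  nodeLabels:
    node.kubernetes.io/instance-type: raspberry-pi
    kubernetes.io/storage: none
```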
## Workload Placement

### DaemonSets (Run Everywhere)
These run on all nodes including ARM64:
| DaemonSet | Namespace | Multi-arch |
|---|---|---|
| velero-node-agent | velero | ✅ |
| cilium-agent | kube-system | ✅ |
| node-exporter | observability | ✅ |
### ARM64-Excluded Workloads

These explicitly exclude ARM64 via node affinity:

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values:
                  - amd64
```
| Workload Type | Reason for Exclusion |
|---|---|
| GPU workloads | No GPU on Pis |
| Longhorn | Pis have no storage label |
| Heavy databases | Insufficient resources |
| Most HelmReleases | Image compatibility |
### ARM64-Compatible Light Workloads
Potential future workloads for ARM64 nodes:
| Workload | Use Case |
|---|---|
| MQTT broker | IoT message routing |
| Pi-hole | DNS ad blocking |
| Home Assistant | Home automation |
| Lightweight proxies | Traffic routing |
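For the opposite direction, pinning one of these candidates *onto* the Pi nodes, a plain `nodeSelector` is sufficient since no taints are involved. A sketch using a hypothetical Pi-hole Deployment (name and image tag are illustrative; `pihole/pihole` does publish arm64 variants):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pihole  # hypothetical example workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pihole
  template:
    metadata:
      labels:
        app: pihole
    spec:
      # Schedule only on the Raspberry Pi workers
      nodeSelector:
        kubernetes.io/arch: arm64
        node.kubernetes.io/instance-type: raspberry-pi
      containers:
        - name: pihole
          image: pihole/pihole:latest  # multi-arch image
```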
## Storage Exclusion

ARM64 nodes are excluded from Longhorn:

```yaml
# Longhorn Helm values
defaultSettings:
  systemManagedComponentsNodeSelector: "kubernetes.io/arch:amd64"
```

Node label:

```yaml
kubernetes.io/storage: none
```
## Resource Constraints
| Node Type | CPU | Memory | Typical Available |
|---|---|---|---|
| Raspberry Pi 4 | 4 cores | 4-8GB | 3 cores, 3GB |
| Raspberry Pi 5 | 4 cores | 8GB | 3.5 cores, 6GB |
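Requests for Pi-bound pods should stay well inside the "typical available" budget above so several pods fit per node; a sketch with illustrative values:

```yaml
# Container resources sized for a Raspberry Pi worker (values are assumptions)
resources:
  requests:
    cpu: 250m       # a quarter core; ~12 such pods fit in 3 available cores
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 512Mi   # hard cap well under the 3GB typically free on a Pi 4
```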
## Multi-Architecture Image Strategy

For workloads that should run on ARM64:

- Use multi-arch base images (e.g., `alpine`, `debian`)
- Build with Docker buildx:
  `docker buildx build --platform linux/amd64,linux/arm64 -t myimage:latest .`
- Verify arch support before deployment
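The "verify arch support" step can be scripted by inspecting an image's manifest list (the JSON that `docker manifest inspect <image>` returns). A minimal sketch, using an embedded sample manifest with placeholder digests:

```python
import json

# Sample manifest list as returned by `docker manifest inspect <image>`,
# trimmed to the fields we need (digests are placeholders)
MANIFEST_LIST = json.loads("""
{
  "schemaVersion": 2,
  "manifests": [
    {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}
""")

def supported_platforms(manifest_list: dict) -> set:
    """Return the os/arch pairs an image's manifest list advertises."""
    return {
        f"{m['platform']['os']}/{m['platform']['architecture']}"
        for m in manifest_list.get("manifests", [])
        if "platform" in m
    }

platforms = supported_platforms(MANIFEST_LIST)
print("linux/arm64" in platforms)  # True: this image ships an arm64 variant
```

A CI gate could run this check against every image referenced in a HelmRelease destined for the Pi nodes and fail the pipeline when `linux/arm64` is missing.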
## Monitoring ARM64 Nodes

```promql
# Memory available per node, grouped by architecture
# (assumes node_exporter series expose a `node` label matching kube_node_labels,
#  and that kube-state-metrics is configured to export the arch label)
sum by (node, label_kubernetes_io_arch) (
  node_memory_MemAvailable_bytes
  * on (node) group_left(label_kubernetes_io_arch)
  kube_node_labels{label_kubernetes_io_arch!=""}
)
```
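A matching alert can flag memory pressure on the Pis before the scheduler starts evicting pods. A sketch of a Prometheus rule; the threshold, duration, and label-join assumptions (a `node` label on node_exporter series) are all illustrative:

```yaml
groups:
  - name: arm64-nodes
    rules:
      - alert: ArmNodeMemoryLow
        expr: |
          node_memory_MemAvailable_bytes
          * on (node) group_left()
          kube_node_labels{label_kubernetes_io_arch="arm64"}
          < 512 * 1024 * 1024
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "ARM64 node {{ $labels.node }} has under 512Mi memory available"
```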
## Future Considerations
- Edge workloads: ARM64 nodes ideal for edge compute patterns
- IoT integration: MQTT, sensor data collection
- Scale-out: Add more Pis for lightweight workload capacity
- ARM64 ML inference: Some models support ARM (TensorFlow Lite)
## Links
- Kubernetes Multi-Architecture
- Talos on Raspberry Pi
- Related: ADR-0002 - Use Talos Linux
- Related: ADR-0026 - Storage Strategy