# ARM64 Raspberry Pi Worker Node Strategy

* Status: accepted
* Date: 2026-02-05
* Deciders: Billy
* Technical Story: Integrate Raspberry Pi nodes into the Kubernetes cluster

## Context and Problem Statement

The homelab cluster includes 5 Raspberry Pi 4/5 nodes (ARM64 architecture) alongside x86_64 servers. These low-power nodes provide:

- Additional compute capacity for lightweight workloads
- Geographic distribution within the home network
- A learning platform for multi-architecture Kubernetes

However, ARM64 nodes have constraints:

- No GPU acceleration
- Lower CPU/memory than the x86_64 servers
- Some container images lack ARM64 support
- Limited local storage

How do we effectively integrate ARM64 nodes while avoiding scheduling failures?

## Decision Drivers

* Maximize utilization of ARM64 compute
* Prevent ARM-incompatible workloads from scheduling
* Maintain cluster stability
* Support multi-arch container images
* Minimize operational overhead

## Considered Options

1. **Node labels + affinity for workload placement**
2. **Separate ARM64-only namespace**
3. **Taints to exclude from general scheduling**
4. **ARM64 nodes for specific workload types only**

## Decision Outcome

Chosen option: **Option 1 + Option 4 hybrid** - use node labels with affinity rules, and designate ARM64 nodes for specific workload categories.
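The node-label half of this decision can be applied declaratively through Talos. A minimal sketch, assuming a Talos release that supports `machine.nodeLabels`; the kubelet reports `kubernetes.io/arch` and `kubernetes.io/os` on its own, so only the custom labels appear here:

```yaml
# Hypothetical Talos machine config patch for a Raspberry Pi worker.
# kubelet already sets kubernetes.io/arch and kubernetes.io/os,
# so only the custom labels are declared.
machine:
  nodeLabels:
    node.kubernetes.io/instance-type: raspberry-pi
    kubernetes.io/storage: none
```

Applied with e.g. `talosctl patch machineconfig --patch-file pi-labels.yaml -n <pi-node>`; the same labels can also be set ad hoc with `kubectl label node`.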
ARM64 nodes handle:

- Lightweight control plane components (where multi-arch images exist)
- Velero node-agent (backup DaemonSet)
- Node-level monitoring (Prometheus node-exporter)
- Future: Edge/IoT workloads

### Positive Consequences

* Clear workload segmentation
* No scheduling failures from arch mismatch
* Efficient use of low-power nodes
* Room for future ARM-specific workloads
* Cost-effective cluster expansion

### Negative Consequences

* Some nodes may be underutilized
* Must maintain multi-arch image awareness
* Additional scheduling complexity

## Cluster Composition

| Node | Architecture | Role | Instance Type |
|------|--------------|------|---------------|
| bruenor | amd64 | control-plane | - |
| catti | amd64 | control-plane | - |
| storm | amd64 | control-plane | - |
| khelben | amd64 | GPU worker (Strix Halo) | - |
| elminster | amd64 | GPU worker (NVIDIA) | - |
| drizzt | amd64 | GPU worker (RDNA2) | - |
| danilo | amd64 | GPU worker (Intel Arc) | - |
| regis | amd64 | worker | - |
| wulfgar | amd64 | worker | - |
| **durnan** | **arm64** | worker | raspberry-pi |
| **elaith** | **arm64** | worker | raspberry-pi |
| **jarlaxle** | **arm64** | worker | raspberry-pi |
| **mirt** | **arm64** | worker | raspberry-pi |
| **volo** | **arm64** | worker | raspberry-pi |

## Node Labels

```yaml
# Applied via Talos machine config or kubectl
labels:
  kubernetes.io/arch: arm64
  kubernetes.io/os: linux
  node.kubernetes.io/instance-type: raspberry-pi
  kubernetes.io/storage: none  # No Longhorn on Pis
```

## Workload Placement

### DaemonSets (Run Everywhere)

These run on all nodes, including ARM64:

| DaemonSet | Namespace | Multi-arch |
|-----------|-----------|------------|
| velero-node-agent | velero | ✅ |
| cilium-agent | kube-system | ✅ |
| node-exporter | observability | ✅ |

### ARM64-Excluded Workloads

These explicitly exclude ARM64 via node affinity:

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values:
                  - amd64
```

| Workload Type | Reason for Exclusion |
|---------------|----------------------|
| GPU workloads | No GPU on Pis |
| Longhorn | Pis have no storage label |
| Heavy databases | Insufficient resources |
| Most HelmReleases | Image compatibility |

### ARM64-Compatible Light Workloads

Potential future workloads for ARM64 nodes:

| Workload | Use Case |
|----------|----------|
| MQTT broker | IoT message routing |
| Pi-hole | DNS ad blocking |
| Home Assistant | Home automation |
| Lightweight proxies | Traffic routing |

## Storage Exclusion

ARM64 nodes are excluded from Longhorn:

```yaml
# Longhorn Helm values
defaultSettings:
  systemManagedComponentsNodeSelector: "kubernetes.io/arch:amd64"
```

Node label:

```yaml
kubernetes.io/storage: none
```

## Resource Constraints

| Node Type | CPU | Memory | Typical Available |
|-----------|-----|--------|-------------------|
| Raspberry Pi 4 | 4 cores | 4-8GB | 3 cores, 3GB |
| Raspberry Pi 5 | 4 cores | 8GB | 3.5 cores, 6GB |

## Multi-Architecture Image Strategy

For workloads that should run on ARM64:

1. **Use multi-arch base images** (e.g., `alpine`, `debian`)
2. **Build with Docker buildx**:

   ```bash
   docker buildx build --platform linux/amd64,linux/arm64 -t myimage:latest .
   ```

3. **Verify arch support** before deployment

## Monitoring ARM64 Nodes

```promql
# Available node memory by architecture
# (assumes node-exporter metrics are relabeled to carry a matching `node` label)
sum by (node, label_kubernetes_io_arch) (
  node_memory_MemAvailable_bytes{}
  * on(node) group_left(label_kubernetes_io_arch)
  kube_node_labels{label_kubernetes_io_arch!=""}
)
```

## Future Considerations

- **Edge workloads**: ARM64 nodes are ideal for edge compute patterns
- **IoT integration**: MQTT, sensor data collection
- **Scale-out**: Add more Pis for lightweight workload capacity
- **ARM64 ML inference**: Some models support ARM (TensorFlow Lite)

## Links

* [Kubernetes Multi-Architecture](https://kubernetes.io/docs/concepts/containers/images/#multi-architecture-images)
* [Talos on Raspberry Pi](https://talos.dev/v1.12/talos-guides/install/single-board-computers/rpi_generic/)
* Related: [ADR-0002](0002-use-talos-linux.md) - Use Talos Linux
* Related: [ADR-0026](0026-storage-strategy.md) - Storage Strategy
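The "verify arch support" step in the Multi-Architecture Image Strategy section can be scripted. A minimal sketch that checks the parsed JSON output of `docker manifest inspect <image>` for a given platform; the `supports_platform` helper and the sample manifest are illustrative, not taken from any image in this cluster:

```python
def supports_platform(manifest_list: dict, os: str = "linux", arch: str = "arm64") -> bool:
    """Return True if an OCI/Docker manifest list advertises the given os/arch.

    `manifest_list` is the parsed JSON output of `docker manifest inspect`.
    """
    return any(
        entry.get("platform", {}).get("os") == os
        and entry.get("platform", {}).get("architecture") == arch
        for entry in manifest_list.get("manifests", [])
    )

# Trimmed-down, illustrative manifest list for a multi-arch image:
sample = {
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}

print(supports_platform(sample))                # → True (arm64 build exists)
print(supports_platform(sample, arch="s390x"))  # → False
```

A gate like this can run in CI before a HelmRelease is promoted, failing the pipeline when an image has no `linux/arm64` entry.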