# Use Talos Linux for Kubernetes Nodes

* Status: accepted
* Date: 2025-11-30
* Deciders: Billy Davies
* Technical Story: Selecting OS for bare-metal Kubernetes cluster

## Context and Problem Statement

We need a reliable, secure operating system for running Kubernetes on bare-metal homelab nodes. The OS should minimize attack surface, be easy to manage at scale, and support our GPU requirements (AMD ROCm, NVIDIA CUDA, Intel).

## Decision Drivers

* Security-first design (immutable, minimal)
* API-driven management (no SSH)
* Support for various GPU drivers
* Kubernetes-native focus
* Community support and updates
* Ease of upgrades

## Considered Options

* Ubuntu Server with kubeadm
* Flatcar Container Linux
* Talos Linux
* k3OS (discontinued)
* Rocky Linux with RKE2

## Decision Outcome

Chosen option: "Talos Linux", because it provides an immutable, API-driven, Kubernetes-focused OS that minimizes attack surface and simplifies operations.

### Positive Consequences

* Immutable root filesystem prevents drift
* No SSH reduces attack vectors
* API-driven management integrates well with GitOps
* Schematic system allows custom kernel modules (GPU drivers)
* Consistent configuration across all nodes
* Automatic updates with minimal disruption

### Negative Consequences

* Learning curve for API-driven management
* Debugging requires different approaches (no SSH)
* Custom extensions require schematic IDs
* Less flexibility for non-Kubernetes workloads

## Pros and Cons of the Options

### Ubuntu Server with kubeadm

* Good, because familiar
* Good, because extensive package availability
* Good, because easy debugging via SSH
* Bad, because mutable system leads to drift
* Bad, because large attack surface
* Bad, because manual package management

### Flatcar Container Linux

* Good, because immutable
* Good, because auto-updates
* Good, because container-focused
* Bad, because less Kubernetes-specific
* Bad, because smaller community than Talos
* Bad, because GPU driver setup more complex

### Talos Linux

* Good, because purpose-built for Kubernetes
* Good, because immutable and minimal
* Good, because API-driven (no SSH)
* Good, because excellent Kubernetes integration
* Good, because active development and community
* Good, because schematic system for GPU drivers
* Bad, because learning curve
* Bad, because no traditional debugging

### k3OS

* Good, because simple
* Bad, because discontinued

### Rocky Linux with RKE2

* Good, because enterprise-like
* Good, because familiar Linux experience
* Bad, because mutable system
* Bad, because more operational overhead
* Bad, because larger attack surface

## Links

* [Talos Linux](https://talos.dev)
* [Talos Image Factory](https://factory.talos.dev)
* Related: [ADR-0005](0005-multi-gpu-strategy.md) - GPU driver integration via schematics
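
To make the "schematic ID" consequence concrete, here is a minimal sketch of how a per-node-class schematic and its installer image reference might be assembled. It assumes the Image Factory schematic layout (`customization.systemExtensions.officialExtensions`) and the `factory.talos.dev/installer/<schematic-id>:<version>` image naming. The helper names, the extension names, the Talos version, and the placeholder schematic ID are all hypothetical and should be checked against the Image Factory before use.

```python
import json

# Host taken from the Links section; everything else here is illustrative.
FACTORY_HOST = "factory.talos.dev"


def build_schematic(extensions):
    """Return a schematic document listing official system extensions.

    Extensions are de-duplicated and sorted so that the same set of
    extensions always yields the same document regardless of input order.
    """
    return {
        "customization": {
            "systemExtensions": {
                "officialExtensions": sorted(set(extensions)),
            }
        }
    }


def installer_image(schematic_id, talos_version):
    """Compose the installer image reference for a schematic ID and version."""
    if not schematic_id:
        raise ValueError("schematic_id must not be empty")
    # Accept versions written with or without the leading "v".
    version = talos_version if talos_version.startswith("v") else f"v{talos_version}"
    return f"{FACTORY_HOST}/installer/{schematic_id}:{version}"


if __name__ == "__main__":
    # Hypothetical extension names for an AMD GPU node.
    schematic = build_schematic(["siderolabs/amdgpu", "siderolabs/amd-ucode"])
    print(json.dumps(schematic, indent=2))
    # "<schematic-id>" stands in for the ID returned by the Image Factory.
    print(installer_image("<schematic-id>", "1.8.0"))
```

Keeping one schematic definition per GPU vendor in Git, and deriving the installer image from the returned ID, fits the GitOps-style management listed under the positive consequences.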