# GitOps with Flux CD

* Status: accepted
* Date: 2025-11-30
* Deciders: Billy Davies
* Technical Story: Implementing GitOps for cluster management

## Context and Problem Statement

Managing a Kubernetes cluster with numerous applications, configurations, and secrets requires a reliable, auditable, and reproducible approach. Manual `kubectl apply` is error-prone and doesn't track state over time.

## Decision Drivers

* Infrastructure as Code (IaC) principles
* Audit trail for all changes
* Self-healing cluster state
* Multi-repository support
* Secret encryption integration
* Active community and maintenance

## Considered Options

* Manual kubectl apply
* ArgoCD
* Flux CD
* Rancher Fleet
* Pulumi/Terraform for Kubernetes

## Decision Outcome

Chosen option: "Flux CD", because it provides a mature GitOps implementation with excellent multi-source support and SOPS integration, and it aligns well with the Kubernetes ecosystem.

### Positive Consequences

* Git is the single source of truth
* Automatic drift detection and correction
* Native SOPS/Age secret encryption
* Multi-repository support (homelab-k8s2 + llm-workflows)
* Native Helm and Kustomize support
* Webhook-free sync (pull-based)

### Negative Consequences

* No built-in UI (use the CLI or a third-party UI)
* Learning curve for CRD-based configuration
* Debugging requires understanding the Flux controllers

## Configuration

### Repository Structure

```
homelab-k8s2/
├── kubernetes/
│   ├── flux/                 # Flux system config
│   │   ├── config/
│   │   │   ├── cluster.yaml
│   │   │   └── secrets.yaml  # SOPS encrypted
│   │   └── repositories/
│   │       ├── helm/         # HelmRepositories
│   │       └── git/          # GitRepositories
│   └── apps/                 # Application Kustomizations
```

### Multi-Repository Sync

```yaml
# GitRepository for Gitea repos (daviestechlabs org)
# Examples: argo, kubeflow, chat-handler, voice-assistant
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: argo-workflows
  namespace: flux-system
spec:
  # interval is a required field; 10m is an illustrative value, not taken from the cluster config
  interval: 10m
  url: https://git.daviestechlabs.io/daviestechlabs/argo.git
  ref:
    branch: main
  # Public repos don't need secretRef
```

Note: The monolithic `llm-workflows` repo has been decomposed into separate repos in the daviestechlabs Gitea organization. See AGENT-ONBOARDING.md for the full list.

### SOPS Integration

```yaml
# .sops.yaml
creation_rules:
  - path_regex: .*\.sops\.yaml$
    # Age public key (kept outside the folded scalar so it is not parsed as part of the key)
    age: >-
      age1...
```

## Pros and Cons of the Options

### Manual kubectl apply

* Good, because simple
* Good, because no setup
* Bad, because no audit trail
* Bad, because no drift detection
* Bad, because not reproducible

### ArgoCD

* Good, because great UI
* Good, because app-of-apps pattern
* Good, because large community
* Bad, because heavier resource usage
* Bad, because webhook-dependent sync
* Bad, because SOPS requires plugins

### Flux CD

* Good, because lightweight
* Good, because pull-based (no webhooks)
* Good, because native SOPS support
* Good, because multi-source/multi-tenant
* Good, because Kubernetes-native CRDs
* Bad, because no built-in UI
* Bad, because CRD learning curve

### Rancher Fleet

* Good, because integrated with Rancher
* Good, because multi-cluster
* Bad, because Rancher ecosystem lock-in
* Bad, because smaller community

### Pulumi/Terraform

* Good, because familiar IaC tools
* Good, because drift detection
* Bad, because not Kubernetes-native
* Bad, because requires state management
* Bad, because not continuous reconciliation

## Links

* [Flux CD](https://fluxcd.io)
* [SOPS Integration](https://fluxcd.io/flux/guides/mozilla-sops/)
* [flux-local](https://github.com/allenporter/flux-local) - Local testing
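
## Appendix: Example Kustomization with SOPS Decryption

The SOPS integration and drift-correction points above can be tied to a concrete Flux resource. The following is a minimal sketch of a Flux `Kustomization` that reconciles the `kubernetes/apps` path and decrypts SOPS-encrypted manifests during reconciliation. The resource name `apps`, the source name `flux-system`, the Secret name `sops-age`, and the interval are assumptions for illustration; they are not taken from this repository.

```yaml
# Hypothetical example: names and interval are assumptions, not the cluster's actual config
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps                 # assumed name
  namespace: flux-system
spec:
  interval: 10m              # how often Flux re-applies the desired state, reverting drift
  path: ./kubernetes/apps    # matches the apps/ directory in the repository structure
  prune: true                # delete cluster objects whose manifests were removed from Git
  sourceRef:
    kind: GitRepository
    name: flux-system        # assumed name of the GitRepository for homelab-k8s2
  decryption:
    provider: sops
    secretRef:
      name: sops-age         # assumed Secret holding the Age private key
```

With a resource along these lines, files matched by the `.sops.yaml` creation rule would be decrypted in-cluster at apply time, so only encrypted secrets need to be committed to Git.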