Secrets Management Strategy
- Status: accepted
- Date: 2026-02-04
- Deciders: Billy
- Technical Story: Establish a secure, GitOps-compatible secrets management approach for the homelab
Context and Problem Statement
Managing secrets in a Kubernetes environment presents challenges: secrets must be available to applications, versionable in Git for GitOps, yet never exposed in plain text in repositories. The homelab needs a solution that balances security with operational simplicity.
How do we manage secrets securely while maintaining GitOps principles and enabling applications to access credentials at runtime?
Decision Drivers
- GitOps compatibility - secrets must be manageable through Git workflows
- Security - no plain text secrets in repositories or logs
- Operational simplicity - minimize manual secret rotation burden
- Application integration - secrets must be consumable by workloads
- Disaster recovery - ability to restore secrets from backups
Considered Options
- SOPS + Age for bootstrap, Vault + External Secrets for runtime
- Sealed Secrets only
- Vault only (with Vault Agent Injector)
- SOPS only for everything
Decision Outcome
Chosen option: Option 1 - SOPS + Age for bootstrap, Vault + External Secrets for runtime
This hybrid approach uses SOPS with Age encryption for bootstrap secrets that must exist before the cluster is fully operational, and HashiCorp Vault with External Secrets Operator for runtime secrets that applications consume.
Positive Consequences
- Bootstrap secrets can be committed to Git safely (encrypted with Age)
- Vault provides centralized secret management with audit logging
- External Secrets Operator enables declarative secret sync from Vault
- Clear separation between infrastructure secrets (SOPS) and application secrets (Vault)
- Secrets are automatically synced and refreshed
Negative Consequences
- Two systems to understand and maintain
- Initial Vault setup requires manual unsealing (or auto-unseal configuration)
- Age key must be securely backed up outside the cluster
Pros and Cons of the Options
Option 1: SOPS + Age for Bootstrap, Vault + External Secrets for Runtime (Chosen)
Architecture:
Bootstrap Secrets (Git-encrypted):
  .sops.yaml ──► age encryption ──► *.sops.yaml files
                                            │
                                            ▼
                                  Flux SOPS decryption
                                            │
                                            ▼
                                   Kubernetes Secrets
Runtime Secrets (Vault-managed):
  Vault KV Store ◄── Manual/API ──► ExternalSecret CR
                                            │
                                            ▼
                                External Secrets Operator
                                            │
                                            ▼
                                   Kubernetes Secrets
- Good, because bootstrap secrets (Flux, cert-manager, Cloudflare) are encrypted in Git
- Good, because Vault provides audit trail and dynamic secret generation
- Good, because External Secrets syncs secrets declaratively (GitOps-friendly)
- Good, because secrets can be rotated in Vault without Git commits
- Bad, because two systems add operational complexity
- Bad, because Vault requires storage (Raft) and HA consideration
Option 2: Sealed Secrets Only
- Good, because single tool to manage
- Good, because native Kubernetes integration
- Bad, because secrets are cluster-specific (can't reuse across clusters)
- Bad, because no central secret management or audit logging
- Bad, because no support for dynamic secrets
Option 3: Vault Only with Agent Injector
- Good, because single source of truth
- Good, because supports dynamic secrets and leases
- Bad, because requires sidecar injection (resource overhead)
- Bad, because of the bootstrap problem: how does anything authenticate to Vault before any secrets exist?
- Bad, because more complex application integration
Option 4: SOPS Only
- Good, because simple - everything encrypted in Git
- Good, because no external dependencies at runtime
- Bad, because keeping all secrets in Git, even encrypted, is risky, particularly for large secrets
- Bad, because secret rotation requires Git commits
- Bad, because no audit logging
Implementation Details
SOPS Configuration
.sops.yaml at repository root:
creation_rules:
  - path_regex: talos/.*\.sops\.ya?ml
    age: age1... # Talos-specific key
  - path_regex: (bootstrap|kubernetes)/.*\.sops\.ya?ml
    age: age1... # Cluster key
Bootstrap secrets encrypted with SOPS:
- bootstrap/sops-age.sops.yaml - Age private key for Flux
- bootstrap/github-deploy-key.sops.yaml - Git repository access
- talos/talsecret.sops.yaml - Talos machine secrets
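For the "Flux SOPS decryption" step, a Flux Kustomization can point at the Age key through its `decryption` field. The following is a minimal sketch, not the repository's actual manifest: the Kustomization name, path, and GitRepository name are hypothetical, and the Secret name `sops-age` is an assumption inferred from the file name bootstrap/sops-age.sops.yaml.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps            # hypothetical name
  namespace: flux-system
spec:
  interval: 10m
  path: ./kubernetes/apps       # hypothetical path inside the repository
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system           # hypothetical source name
  decryption:
    provider: sops              # decrypt *.sops.yaml files before applying
    secretRef:
      name: sops-age            # assumed Secret holding the Age private key
```

Flux reads Age identities from keys in that Secret whose names end in `.agekey`, so the Secret has to exist in the cluster before this Kustomization reconciles.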
Vault Configuration
Deployment: HA mode with 3 replicas, Raft storage on Longhorn
# HelmRelease values
server:
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
  dataStorage:
    storageClass: longhorn
    size: 2Gi
Kubernetes Auth: External Secrets authenticates via ServiceAccount
# ClusterSecretStore
spec:
  provider:
    vault:
      server: "http://vault.security.svc:8200"
      path: "kv"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"
External Secrets Usage Pattern
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: app-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault
  target:
    name: app-credentials
  data:
    - secretKey: password
      remoteRef:
        key: kv/data/myapp
        property: password
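To close the loop on application integration, a workload can read the synced Secret like any other Kubernetes Secret. The sketch below assumes a hypothetical `myapp` Deployment, image, and `APP_PASSWORD` variable name; only the Secret name `app-credentials` and the key `password` come from the ExternalSecret above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                                   # hypothetical workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: ghcr.io/example/myapp:1.0.0    # hypothetical image
          env:
            - name: APP_PASSWORD                # hypothetical variable name
              valueFrom:
                secretKeyRef:
                  name: app-credentials         # Secret created by the ExternalSecret
                  key: password
```

Environment variables are read once at container start, so a value rotated in Vault only reaches a running pod after the Secret is refreshed and the pod is restarted.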
Secret Categories
| Category | Storage | Examples |
|---|---|---|
| Bootstrap | SOPS + Age | Age keys, deploy keys, Talos secrets |
| Infrastructure | Vault | Database credentials, API tokens |
| Application | Vault | Service accounts, OAuth secrets |
| Certificates | cert-manager | TLS certs (auto-generated) |
Disaster Recovery
- Age private key - Stored securely outside cluster (password manager, hardware key)
- Vault data - Backed up via Longhorn snapshots
- Unseal keys - Stored securely outside cluster (Shamir shares distributed)
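One way the Longhorn snapshots for Vault data could be scheduled is a Longhorn RecurringJob. This is a sketch under assumptions rather than the cluster's actual configuration: the job name, schedule, retention count, and the use of the `default` volume group are all hypothetical.

```yaml
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: vault-daily-snapshot    # hypothetical name
  namespace: longhorn-system
spec:
  task: snapshot                # "backup" would instead need a configured backup target
  cron: "0 3 * * *"             # hypothetical schedule: daily at 03:00
  retain: 7                     # hypothetical: keep the last 7 snapshots
  concurrency: 1
  groups:
    - default                   # assumes the Vault volumes are in the default group
```

Snapshots live on the same Longhorn volumes as the data, so this covers rollback within the cluster; recovering from loss of the cluster itself would additionally rely on the off-cluster items listed above.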