Internal PKI with Vault and cert-manager
- Status: accepted
- Date: 2026-02-16
- Deciders: Billy
- Technical Story: Replace self-signed internal certificates with a proper CA chain using Vault PKI
Context and Problem Statement
Internal services on *.lab.daviestechlabs.io use a selfsigned-internal ClusterIssuer. Each cert-manager Certificate gets its own unique self-signed root — there is no shared CA. This causes problems:
- Off-cluster devices (gravenhollow, candlekeep, waterdeep) have no way to obtain trusted certs
- Clients cannot verify server certs because there's no CA to trust
- RustFS on gravenhollow has a TLS cert only valid for localhost, breaking S3 clients
- No certificate chain means no ability to distribute a single CA bundle across the fleet
The homelab already runs HashiCorp Vault in HA mode (3 replicas, Raft storage) and cert-manager with Let's Encrypt for public certs. How do we issue trusted internal certificates for both in-cluster and off-cluster services?
Decision Drivers
- Vault is already deployed and used for secrets management
- cert-manager is already deployed with ClusterIssuer support
- Off-cluster devices (NAS, Mac Mini) need valid TLS certs
- Single CA root to trust across all machines
- Automated renewal for in-cluster certs via cert-manager
- Must not disrupt existing Let's Encrypt public certs
Considered Options
- Vault PKI secrets engine + cert-manager Vault ClusterIssuer
- step-ca (Smallstep) as standalone internal CA
- Keep self-signed, distribute individual certs manually
Decision Outcome
Chosen option: Option 1 — Vault PKI + cert-manager Vault ClusterIssuer, because it builds on existing infrastructure (Vault and cert-manager), provides a proper two-tier CA chain, and supports both in-cluster automated renewal and off-cluster cert issuance via the Vault API.
Positive Consequences
- Single root CA — one trust anchor for the entire homelab
- cert-manager automatically renews in-cluster certs via Vault
- Off-cluster devices request certs via the vault write CLI
- Two-tier CA (root → intermediate) follows PKI best practices
- Root CA key never leaves Vault
- Existing Let's Encrypt public certs are unaffected
Negative Consequences
- Vault becomes a dependency for internal TLS issuance
- Off-cluster cert renewal requires manual or scripted vault write (no ACME)
- CA root cert must be distributed to trust stores on all machines
- Vault PKI engine adds operational complexity
Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│                      Vault PKI (security namespace)                      │
│                                                                          │
│  ┌──────────────────────┐       ┌──────────────────────────┐             │
│  │ pki/ (Root CA)       │       │ pki_int/ (Intermediate)  │             │
│  │                      │ signs │                          │             │
│  │ Homelab Root CA      │──────▶│ Homelab Intermediate CA  │             │
│  │ TTL: 10 years        │       │ TTL: 5 years             │             │
│  │                      │       │                          │             │
│  │ (only signs          │       │ Role: lab-internal       │             │
│  │  intermediates)      │       │ *.lab.daviestechlabs.io  │             │
│  └──────────────────────┘       │ TTL: 90 days (default)   │             │
│                                 │ Key: EC P-256            │             │
│                                 └─────────┬────────────────┘             │
│                                           │                              │
│                          ┌────────────────┼────────────────┐             │
│                          │                │                │             │
│                          ▼                ▼                ▼             │
│                  ┌───────────────┐ ┌─────────────┐   ┌───────────┐       │
│                  │ cert-manager  │ │ vault write │   │  Future   │       │
│                  │ ClusterIssuer │ │    (CLI)    │   │   ACME    │       │
│                  │ vault-internal│ │             │   │           │       │
│                  └───────┬───────┘ └──────┬──────┘   └───────────┘       │
│                          │                │                              │
└──────────────────────────┼────────────────┼──────────────────────────────┘
                           │                │
            ┌──────────────▼──┐  ┌──────────▼────────────────┐
            │ In-Cluster      │  │ Off-Cluster               │
            │                 │  │                           │
            │ *.lab.dav...    │  │ gravenhollow (RustFS)     │
            │ envoy-internal  │  │ candlekeep (QNAP)         │
            │ auto-renewed    │  │ waterdeep (Mac Mini)      │
            │ by cert-manager │  │ manual/scripted renewal   │
            └─────────────────┘  └───────────────────────────┘
Implementation
Vault PKI Configuration (Phases 1–4, completed)
# Phase 1: Root CA
vault secrets enable -path=pki pki
vault secrets tune -max-lease-ttl=87600h pki
vault write pki/root/generate/internal \
    common_name="Homelab Root CA" issuer_name="homelab-root" ttl=87600h
vault write pki/config/urls \
    issuing_certificates="http://vault.security.svc:8200/v1/pki/ca" \
    crl_distribution_points="http://vault.security.svc:8200/v1/pki/crl"
# Phase 2: Intermediate CA
vault secrets enable -path=pki_int pki
vault secrets tune -max-lease-ttl=43800h pki_int
vault write -field=csr pki_int/intermediate/generate/internal \
    common_name="Homelab Intermediate CA" issuer_name="homelab-intermediate" \
    > /tmp/intermediate.csr
vault write -field=certificate pki/root/sign-intermediate \
    issuer_ref="homelab-root" csr=@/tmp/intermediate.csr \
    format=pem_bundle ttl=43800h > /tmp/intermediate.crt
vault write pki_int/intermediate/set-signed certificate=@/tmp/intermediate.crt
# Phase 3: PKI Role
vault write pki_int/roles/lab-internal \
    allowed_domains="lab.daviestechlabs.io" \
    allow_subdomains=true allow_bare_domains=true \
    max_ttl=8760h ttl=2160h key_type=ec key_bits=256
# Phase 4: Policy and Kubernetes Auth Role
vault policy write cert-manager-pki - <<EOF
path "pki_int/sign/lab-internal" { capabilities = ["create", "update"] }
path "pki_int/issue/lab-internal" { capabilities = ["create", "update"] }
EOF
vault write auth/kubernetes/role/cert-manager \
    bound_service_account_names=cert-manager \
    bound_service_account_namespaces=cert-manager \
    audience="https://192.168.100.20:6443" \
    policies=cert-manager-pki ttl=1h
Phase 5: Kubernetes Manifests (GitOps)
New vault-internal ClusterIssuer replaces selfsigned-internal:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: vault-internal
spec:
  vault:
    server: http://vault.security.svc:8200
    path: pki_int/sign/lab-internal
    auth:
      kubernetes:
        role: cert-manager
        mountPath: /v1/auth/kubernetes
        serviceAccountRef:
          name: cert-manager
Envoy internal wildcard certificate updated to use vault-internal.
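One way to confirm the switch took effect is to check who issued the certificate now stored in the gateway's TLS Secret. The helper below is a minimal sketch, assuming openssl is available. The function name issued_by is hypothetical, and it does a simple substring match on the issuer's CN rather than a full chain validation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeed if the first PEM certificate on stdin names a CA with CN <expected-cn>
# as its issuer. This is a substring match on the issuer DN, not a chain check.
issued_by() {
  local expected_cn="$1" issuer
  issuer="$(openssl x509 -noout -issuer -nameopt RFC2253)"
  case "$issuer" in
    *"CN=$expected_cn"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (Secret name and namespace are placeholders, not taken from this ADR):
#   kubectl get secret <secret> -n <namespace> -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d | issued_by "Homelab Intermediate CA"
```

A certificate still issued by selfsigned-internal would fail this check, because its issuer is the certificate's own subject.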
Phase 6: Off-Cluster Cert Issuance
vault write -format=json pki_int/issue/lab-internal \
    common_name="gravenhollow.lab.daviestechlabs.io" \
    alt_names="gravenhollow.lab.daviestechlabs.io" \
    ttl=8760h
# Extract cert, key, ca_chain from JSON output
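The extraction step noted above could be scripted along these lines. This is a sketch, not the homelab's actual tooling: it assumes jq is installed and that VAULT_ADDR and a token allowed to write to pki_int/issue/lab-internal are already set. The function names and output file names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Split the JSON produced by `vault write -format=json pki_int/issue/...`
# into PEM files. Arguments: <json-file> <output-dir> <basename>.
split_issue_json() {
  local json="$1" out_dir="$2" name="$3"
  mkdir -p "$out_dir"
  # Write the private key with owner-only permissions.
  ( umask 077; jq -r '.data.private_key' "$json" > "$out_dir/$name.key" )
  jq -r '.data.certificate' "$json" > "$out_dir/$name.crt"
  # ca_chain is a JSON array of PEM certificates; emit them one after another.
  jq -r '.data.ca_chain[]' "$json" > "$out_dir/$name.chain.crt"
  # Leaf followed by the CA chain, for servers that expect a single bundle.
  cat "$out_dir/$name.crt" "$out_dir/$name.chain.crt" > "$out_dir/$name.fullchain.crt"
}

# Issue a certificate for <fqdn> and write the PEM files into <output-dir>.
# Files are named after the first DNS label (e.g. gravenhollow.crt).
issue_cert() {
  local fqdn="$1" out_dir="$2" tmp
  tmp="$(mktemp)"
  vault write -format=json pki_int/issue/lab-internal \
      common_name="$fqdn" alt_names="$fqdn" ttl=8760h > "$tmp"
  split_issue_json "$tmp" "$out_dir" "${fqdn%%.*}"
  rm -f "$tmp"
}
```

Calling issue_cert gravenhollow.lab.daviestechlabs.io with an output directory would leave gravenhollow.key, gravenhollow.crt, gravenhollow.chain.crt, and gravenhollow.fullchain.crt there. Each call issues a new certificate and key, so the JSON is captured once and split locally rather than querying Vault once per field.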
Phase 7: CA Distribution
The root CA cert is distributed to:
- Kubernetes pods: via ConfigMap homelab-ca-bundle or cert-manager ca-injector
- waterdeep (macOS): sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain homelab-root-ca.crt
- gravenhollow / candlekeep: installed into OS trust store
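Before installing the root certificate into a trust store, it can be worth checking that an issued certificate actually chains to it. The following is a sketch under a few assumptions: curl and openssl are installed, VAULT_ADDR points at Vault, and the function and file names are illustrative. It downloads the root from Vault's unauthenticated /v1/pki/ca/pem endpoint.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Download the root CA certificate (PEM) from the pki/ mount into <out-file>.
fetch_root_ca() {
  local out="$1"
  curl -fsS "${VAULT_ADDR}/v1/pki/ca/pem" -o "$out"
}

# Succeed if the first certificate in <fullchain-file> verifies against
# <root-file>. Any further certificates in <fullchain-file> are offered to
# openssl as untrusted intermediates.
verify_with_root() {
  local root="$1" fullchain="$2"
  openssl verify -CAfile "$root" -untrusted "$fullchain" "$fullchain" >/dev/null
}
```

For example, fetch_root_ca homelab-root-ca.crt followed by verify_with_root homelab-root-ca.crt on a file holding a leaf certificate and the intermediate would confirm the chain before the root is added to the macOS keychain or an OS trust store.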
Certificate Inventory
| Service | Issuer | Renewal | Notes |
|---|---|---|---|
| *.daviestechlabs.io | letsencrypt-production | Auto (cert-manager) | Public, unchanged |
| *.lab.daviestechlabs.io | vault-internal | Auto (cert-manager) | Envoy internal gateway |
| gravenhollow.lab.daviestechlabs.io | vault-internal (via CLI) | Manual/cron | RustFS S3, NFS |
| candlekeep.lab.daviestechlabs.io | vault-internal (via CLI) | Manual/cron | QNAP NAS |
| waterdeep.lab.daviestechlabs.io | vault-internal (via CLI) | Manual/cron | Mac Mini dev workstation |
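For the rows marked Manual/cron, a scheduled job needs some way to decide when a certificate is due. A minimal sketch, assuming openssl is installed. The function name and the threshold mentioned in the usage note are illustrative choices, not part of this decision.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Return 0 if the certificate in <cert-file> expires within <days> days
# (or has already expired, or cannot be read), 1 otherwise.
needs_renewal() {
  local cert="$1" days="$2"
  # `openssl x509 -checkend N` exits 0 only when the certificate
  # will still be valid N seconds from now.
  if openssl x509 -checkend "$(( days * 86400 ))" -noout -in "$cert" >/dev/null 2>&1; then
    return 1
  fi
  return 0
}
```

A cron entry could call needs_renewal with the installed certificate path and a threshold such as 30 days, and only on success rerun the vault write command from Phase 6 and reload the service.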
Links
- Vault PKI Secrets Engine
- cert-manager Vault Issuer
- Related: ADR-0026 — Storage strategy (gravenhollow S3)
- Related: ADR-0037 — Node naming conventions
- Related: ADR-0059 — Mac Mini Ray worker