# TLS Certificate Strategy

* Status: accepted
* Date: 2026-02-09
* Deciders: Billy
* Technical Story: Automate TLS certificate provisioning for both public and internal services

## Context and Problem Statement

Every HTTPS service in the cluster needs a valid TLS certificate. Public services need trusted certificates (Let's Encrypt), while internal services can use self-signed certificates. Manual certificate management doesn't scale across 30+ services.

How do we automate certificate issuance and renewal for both public and internal domains?

## Decision Drivers

* Fully automated certificate lifecycle (issuance, renewal, rotation)
* Wildcard certificates to avoid per-service certificate sprawl
* DNS-01 challenge for wildcard support (HTTP-01 cannot validate wildcard names)
* Internal services need certificates too (browser warnings are unacceptable)
* Zero downtime during renewal

## Decision Outcome

Deploy **cert-manager** with two ClusterIssuers: Let's Encrypt (DNS-01 via Cloudflare) for public domains, and a self-signed issuer for internal domains.

## Deployment Configuration

| Setting | Value |
|---|---|
| **Chart** | `cert-manager` from `oci://quay.io/jetstack/charts/cert-manager` |
| **Version** | v1.19.3 |
| **Namespace** | `cert-manager` |
| **Replicas** | 1 |

## Certificate Issuers

### letsencrypt-production (Public)

| Setting | Value |
|---|---|
| **Type** | ACME (Let's Encrypt) |
| **Challenge** | DNS-01 via Cloudflare API |
| **Nameservers** | `1.1.1.1:443`, `1.0.0.1:443` (DNS-over-HTTPS) |
| **Zone** | `daviestechlabs.io` |

Uses a Cloudflare API token (SOPS-encrypted) to create DNS-01 challenge TXT records. Recursive nameservers are configured to use Cloudflare DoH for faster propagation checks.

### selfsigned-internal (Private)

| Setting | Value |
|---|---|
| **Type** | Self-Signed |
| **Use** | `*.lab.daviestechlabs.io` internal services |

Used for internal services where browser trust isn't critical (admin UIs accessed by the operator).

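
The two issuers described above could be declared roughly as follows. This is a minimal sketch, not the manifests actually deployed: the contact email, the ACME account key Secret name, and the Cloudflare token Secret name and key are all assumptions made for illustration. Only the issuer names, the DNS zone, and the use of a Cloudflare API token come from this ADR.

```yaml
# Sketch only: email, Secret names, and Secret keys below are hypothetical.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com                 # hypothetical contact address
    privateKeySecretRef:
      name: letsencrypt-production-account   # hypothetical ACME account key Secret
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token     # hypothetical SOPS-decrypted Secret
              key: api-token                 # hypothetical key within that Secret
        selector:
          dnsZones:
            - daviestechlabs.io
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-internal
spec:
  selfSigned: {}
```

Because both are cluster-scoped `ClusterIssuer` resources, a `Certificate` in any namespace can reference them by name. For DNS-01 solvers on a `ClusterIssuer`, cert-manager looks up the referenced token Secret in its cluster resource namespace, which defaults to the namespace cert-manager runs in.
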
## Certificates

| Domain | Issuer | Type | Duration | Renewal |
|--------|--------|------|----------|---------|
| `daviestechlabs.io` + `*.daviestechlabs.io` | letsencrypt-production | Wildcard | 90 days (LE default) | Auto |
| `lab.daviestechlabs.io` + `*.lab.daviestechlabs.io` | selfsigned-internal | Wildcard | 1 year | 30 days before expiry |

Wildcard certificates are used to avoid creating individual certificates per service. Both certificates are referenced by the Envoy Gateway listeners.

## Integration Points

- **Cloudflare:** API token for DNS-01 challenges (stored as SOPS-encrypted Secret)
- **Envoy Gateway:** References certificates in Gateway listener TLS configuration
- **Flux:** Health check validates ClusterIssuer readiness before dependent resources
- **Prometheus:** ServiceMonitor enabled for cert-manager metrics

## Links

* Related to [ADR-0044](0044-dns-and-external-access.md) (DNS architecture)
* Related to [ADR-0010](0010-use-envoy-gateway.md) (Gateway TLS listeners)
* [cert-manager Documentation](https://cert-manager.io/docs/)
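
For reference, the internal wildcard row of the Certificates table, and the way a Gateway listener might consume the resulting Secret, can be sketched as below. The resource names, the namespace, the Secret name, and the `gatewayClassName` are assumptions for illustration and are not taken from the deployed manifests. The durations are the table values converted to hours (1 year = 8760h, 30 days = 720h).

```yaml
# Sketch only: names, namespace, and gatewayClassName are hypothetical.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: lab-wildcard
  namespace: network               # hypothetical; must match the Gateway's namespace
spec:
  secretName: lab-wildcard-tls     # hypothetical Secret that cert-manager writes
  issuerRef:
    name: selfsigned-internal
    kind: ClusterIssuer
  dnsNames:
    - lab.daviestechlabs.io
    - "*.lab.daviestechlabs.io"
  duration: 8760h                  # 1 year
  renewBefore: 720h                # renew 30 days before expiry
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: internal                   # hypothetical Gateway name
  namespace: network
spec:
  gatewayClassName: envoy          # hypothetical class name
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: "*.lab.daviestechlabs.io"
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: lab-wildcard-tls
```

The public wildcard would follow the same shape with `letsencrypt-production` as the issuer and the `daviestechlabs.io` names. Leaving `duration` unset there lets the 90-day Let's Encrypt lifetime apply.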