# OPA Gatekeeper Policy Framework

* Status: accepted
* Date: 2026-02-09
* Deciders: Billy
* Technical Story: Document the Gatekeeper policy framework, constraint templates, and progressive enforcement strategy

## Context and Problem Statement

Out of the box, Kubernetes enforces little organizational policy beyond basic Pod Security Standards. Without admission control, workloads can be deployed with excessive privileges, missing labels, or no resource limits, creating operational and security risks.

How do we enforce cluster-wide policies while avoiding disruption to existing workloads during rollout?

## Decision Drivers

* Prevent privilege escalation from misconfigured pods
* Enforce consistent labelling for observability and ownership
* Require resource limits to prevent noisy-neighbor issues
* Progressive rollout: observe violations before blocking
* System namespaces and infrastructure components must be exempted

## Decision Outcome

Deploy **OPA Gatekeeper** with all constraints initially in **warn** mode, using a three-stage Flux dependency chain to ensure correct resource ordering: the controller and its CRDs first, then the constraint templates, then the constraints that instantiate them.

## Architecture

```
┌───────────────────────────────────────────────────────────┐
│                   Flux Dependency Chain                   │
│                                                           │
│  Stage 1: gatekeeper (controller)                         │
│       ↓ depends-on + healthChecks on CRDs                 │
│  Stage 2: constraint-templates (Rego policies)            │
│       ↓ depends-on                                        │
│  Stage 3: constraints (policy instances)                  │
└───────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────┐
│                      Admission Flow                       │
│                                                           │
│  kubectl/Flux → API Server → Gatekeeper Webhook           │
│                                      │                    │
│                              ┌───────┴───────┐            │
│                              │   Evaluate    │            │
│                              │  Constraints  │            │
│                              └───────┬───────┘            │
│                                      │                    │
│                        ┌─────────────┼─────────────┐      │
│                        ▼             ▼             ▼      │
│                      warn         dryrun         deny     │
│                   (log only)   (audit only)    (reject)   │
└───────────────────────────────────────────────────────────┘
```

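The three-stage chain could be expressed with Flux `Kustomization` resources roughly as follows. This is a minimal sketch, not the repository's actual manifests: the `path` values, `interval`, `timeout`, and the `flux-system` `GitRepository` source are assumptions, and placing the CRD health check on stage 1 is one possible reading of the diagram.

```yaml
# Stage 1: Gatekeeper controller. The health check holds dependants back
# until the ConstraintTemplate CRD is established.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: gatekeeper
  namespace: flux-system
spec:
  interval: 10m                              # assumed
  timeout: 5m                                # assumed
  path: ./infrastructure/gatekeeper          # hypothetical path
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system                        # assumed source name
  healthChecks:
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: constrainttemplates.templates.gatekeeper.sh
---
# Stage 2: constraint templates (Rego policies), applied after stage 1 is ready.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: constraint-templates
  namespace: flux-system
spec:
  dependsOn:
    - name: gatekeeper
  interval: 10m                              # assumed
  path: ./infrastructure/gatekeeper-templates    # hypothetical path
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system                        # assumed source name
---
# Stage 3: constraints (policy instances), applied after the templates.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: constraints
  namespace: flux-system
spec:
  dependsOn:
    - name: constraint-templates
  interval: 10m                              # assumed
  path: ./infrastructure/gatekeeper-constraints  # hypothetical path
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system                        # assumed source name
```

The ordering matters because a constraint's kind only exists once Gatekeeper has processed the matching template, so applying stage 3 too early fails with an unknown-kind error.
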
## Deployment Configuration

| | |
|---|---|
| **Chart** | `gatekeeper` from `https://open-policy-agent.github.io/gatekeeper/charts` |
| **Namespace** | `gatekeeper-system` |
| **Replicas** | 2 |
| **Audit interval** | 60 seconds |
| **Webhook failure policy** | `Ignore` (fail-open) |
| **Log denies** | `true` |
| **Metrics backend** | Prometheus |

The webhook uses the `Ignore` failure policy so that workloads are still admitted if Gatekeeper itself is unavailable; availability takes priority over enforcement in a homelab.

### Resources

| Component | CPU Request/Limit | Memory Request/Limit |
|-----------|-------------------|----------------------|
| Controller | 100m / 1000m | 256Mi / 512Mi |
| Audit Controller | 100m / 1000m | 1Gi / 4Gi |

The audit controller requires significantly more memory because it caches cluster state for background evaluation of all existing resources.

### Exempt Namespaces (Webhook)

`kube-system`, `gatekeeper-system`, `flux-system`

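Taken together, the deployment settings, resource sizing, and exempt namespaces above might map onto Helm values like the following. The key names are given as recalled from the upstream `gatekeeper` chart and should be checked against the chart version in use; the file itself is an illustrative sketch, not the values file from this repository.

```yaml
# Sketch of values for the gatekeeper chart; verify key names against the chart's README.
replicas: 2
auditInterval: 60                        # seconds
validatingWebhookFailurePolicy: Ignore   # fail-open
logDenies: true
metricsBackends:
  - prometheus

controllerManager:
  exemptNamespaces:
    - kube-system
    - gatekeeper-system
    - flux-system
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 1000m
      memory: 512Mi

audit:
  resources:
    requests:
      cpu: 100m
      memory: 1Gi
    limits:
      cpu: 1000m
      memory: 4Gi
```
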
## Constraint Templates

Three Rego-based constraint templates define the policy vocabulary:

### K8sPSPPrivilegedContainer

Blocks containers with `securityContext.privileged: true`. Checks all container types (containers, initContainers, ephemeralContainers). Supports `exemptImages` with wildcard prefix matching.

### K8sRequiredLabels

Requires specified labels on resources, with optional regex validation on values. Used to enforce the `app.kubernetes.io/name` convention.

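For orientation, a required-labels template in the style of the upstream Gatekeeper policy library looks roughly like this. It is a sketch under assumptions: the parameter schema (`labels[].key`, `labels[].allowedRegex`), the violation messages, and the Rego are modelled on the library version and may differ from the template deployed here.

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:                # assumed parameter shape
              type: array
              items:
                type: object
                properties:
                  key:
                    type: string
                  allowedRegex:
                    type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        # Report any required label key that is absent from the object.
        violation[{"msg": msg}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_].key}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("missing required labels: %v", [missing])
        }

        # Report a label whose value does not match the optional allowedRegex.
        violation[{"msg": msg}] {
          value := input.review.object.metadata.labels[key]
          expected := input.parameters.labels[_]
          expected.key == key
          expected.allowedRegex != ""
          not regex.match(expected.allowedRegex, value)
          msg := sprintf("label %v=%v does not match %v", [key, value, expected.allowedRegex])
        }
```

When `allowedRegex` is omitted for a label, the second rule does not fire, so only the presence of the key is checked.
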
### K8sContainerLimits

Requires containers to define resource limits. Parameterised for CPU and memory independently, with image exemptions.

## Constraints

All three constraints use **`enforcementAction: warn`**: violating requests are still admitted, the client receives an admission warning, and the violations are logged and surfaced in metrics. Nothing is blocked.

### deny-privileged-containers

| | |
|---|---|
| **Template** | `K8sPSPPrivilegedContainer` |
| **Targets** | Pods |
| **Action** | warn |

**Excluded namespaces:** kube-system, kube-public, kube-node-lease, gatekeeper-system, cilium-secrets, longhorn-system, observability, trivy-system, security, gpu-operator

**Exempt images:**
- `quay.io/cilium/*`: CNI requires privileged access
- `ghcr.io/longhorn/*`: Storage driver needs host access
- `docker.io/falcosecurity/*`: eBPF probe requires elevated privileges
- `registry.k8s.io/*`: Core Kubernetes components
- `nvcr.io/nvidia/*`: GPU operator/drivers

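As a manifest, the constraint described above would look approximately like this. The namespaces, images, and action come from this ADR; the `v1beta1` API version and the exact layout are assumptions based on the usual Gatekeeper constraint shape.

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1   # assumed API version
kind: K8sPSPPrivilegedContainer
metadata:
  name: deny-privileged-containers
spec:
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces:
      - kube-system
      - kube-public
      - kube-node-lease
      - gatekeeper-system
      - cilium-secrets
      - longhorn-system
      - observability
      - trivy-system
      - security
      - gpu-operator
  parameters:
    exemptImages:
      - "quay.io/cilium/*"
      - "ghcr.io/longhorn/*"
      - "docker.io/falcosecurity/*"
      - "registry.k8s.io/*"
      - "nvcr.io/nvidia/*"
```
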
### require-app-labels

| | |
|---|---|
| **Template** | `K8sRequiredLabels` |
| **Targets** | Deployments, StatefulSets, DaemonSets |
| **Action** | warn |

Requires the `app.kubernetes.io/name` label. The constraint excludes system and infrastructure namespaces: kube-system, kube-public, kube-node-lease, gatekeeper-system, flux-system, cilium-secrets, cnpg-system.

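A sketch of the corresponding manifest follows. The targets and excluded namespaces are taken from this ADR; the `parameters.labels[].key` shape is an assumption borrowed from the upstream policy library and may not match the deployed template's schema.

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1   # assumed API version
kind: K8sRequiredLabels
metadata:
  name: require-app-labels
spec:
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment", "StatefulSet", "DaemonSet"]
    excludedNamespaces:
      - kube-system
      - kube-public
      - kube-node-lease
      - gatekeeper-system
      - flux-system
      - cilium-secrets
      - cnpg-system
  parameters:
    labels:                                     # assumed parameter shape
      - key: app.kubernetes.io/name
```
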
### require-container-limits

| | |
|---|---|
| **Template** | `K8sContainerLimits` |
| **Targets** | Pods |
| **Action** | warn |

Requires memory limits (`requireMemory: true`) but not CPU limits (`requireCPU: false`). CPU limits are intentionally not required because they can throttle a container even when the node has idle CPU, while memory limits stop a single container from exhausting node memory and triggering OOM kills of other workloads.

**Exempt images:** `registry.k8s.io/*`, `quay.io/cilium/*`, `docker.io/library/*`

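In manifest form this could look like the sketch below. `requireMemory` and `requireCPU` are the parameters named in this ADR; the `exemptImages` parameter name is assumed by analogy with the privileged-container template, and the API version is likewise an assumption.

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1   # assumed API version
kind: K8sContainerLimits
metadata:
  name: require-container-limits
spec:
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    requireMemory: true
    requireCPU: false
    exemptImages:                               # assumed parameter name
      - "registry.k8s.io/*"
      - "quay.io/cilium/*"
      - "docker.io/library/*"
```
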
## Enforcement Progression

| Phase | Action | Purpose |
|-------|--------|---------|
| Current | `warn` | Establish baseline: understand existing violations |
| Next | `dryrun` | Audit-only mode visible in compliance reports |
| Target | `deny` | Block non-compliant resources at admission |

The move to `deny` is gated on resolving the baseline violations surfaced in the warn phase.

## Observability

**ServiceMonitor:** Scrapes Gatekeeper pods (label `gatekeeper.sh/system: "yes"`) on port `metrics` at a 30s interval.

**Grafana dashboards:**

| Dashboard | Grafana ID | Purpose |
|-----------|------------|---------|
| Gatekeeper Overview | #15763 | Policy status, constraint health |
| Gatekeeper Violations | #14828 | Violation trends and details |

## Links

* Implements [ADR-0018](0018-security-policy-enforcement.md) (Gatekeeper component)
* [OPA Gatekeeper Documentation](https://open-policy-agent.github.io/gatekeeper/)
* [Gatekeeper Policy Library](https://open-policy-agent.github.io/gatekeeper-library/)