
KEDA Event-Driven Autoscaling

  • Status: accepted
  • Date: 2026-02-09
  • Deciders: Billy
  • Technical Story: Scale workloads based on external event sources rather than only CPU/memory metrics

Context and Problem Statement

The Kubernetes Horizontal Pod Autoscaler (HPA) scales on CPU and memory out of the box, but many homelab workloads have better scaling signals in external systems — Envoy Gateway request queues, NATS queue depth, or GPU utilization. Scaling on the right signal reduces latency and avoids over-provisioning.

How do we autoscale workloads based on external metrics like message queues, HTTP request rates, and custom Prometheus queries?

Decision Drivers

  • Scale on NATS queue depth for inference pipelines
  • Scale on Envoy Gateway metrics for HTTP workloads
  • Prometheus integration for arbitrary custom metrics
  • CRD-based scalers compatible with Flux GitOps
  • Low resource overhead for the scaler controller itself

Considered Options

  1. KEDA — Kubernetes Event-Driven Autoscaling
  2. Custom HPA with Prometheus Adapter — HPA + external-metrics API
  3. Knative Serving — Serverless autoscaler with scale-to-zero

Decision Outcome

Chosen option: KEDA, because it provides a large catalog of built-in scalers (Prometheus, NATS, HTTP), supports scale-to-zero, and integrates cleanly with the existing HelmRelease/Kustomization GitOps workflow.
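As a sketch of how a scaler lands in Git: a ScaledObject wraps an existing Deployment and attaches a trigger. The names, namespace, Prometheus address, and metric query below are illustrative, not taken from the repo:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ray-serve-inference    # hypothetical name
  namespace: ml                # hypothetical namespace
spec:
  scaleTargetRef:
    name: ray-serve-inference  # the Deployment KEDA will scale
  minReplicaCount: 1
  maxReplicaCount: 4
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090  # assumed Prometheus service
        query: sum(ray_serve_num_pending_requests)            # illustrative metric name
        threshold: "5"
```

KEDA generates and manages an HPA from this object behind the scenes, which is why manually defined HPAs on the same target conflict with it.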

Positive Consequences

  • 60+ built-in scalers covering all homelab event sources
  • ScaledObject CRDs fit naturally in GitOps workflow
  • Scale-to-zero for bursty workloads (saves GPU resources)
  • ServiceMonitors for self-monitoring via Prometheus
  • Grafana dashboard included for visibility

Negative Consequences

  • Additional CRDs and controller pods
  • ScaledObject/TriggerAuthentication learning curve
  • Potential conflict with manually defined HPAs targeting the same workload

Deployment Configuration

  • Chart: keda OCI chart v2.19.0
  • Namespace: keda
  • Monitoring: ServiceMonitor enabled, Grafana dashboard provisioned
  • Webhooks: enabled
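A minimal Flux sketch of the deployment above, assuming KEDA's chart is consumed as an OCI artifact (the registry URL and values keys follow the upstream kedacore chart and should be verified against the pinned version):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: OCIRepository
metadata:
  name: keda
  namespace: keda
spec:
  interval: 1h
  url: oci://ghcr.io/kedacore/charts/keda  # assumed OCI location of the chart
  ref:
    tag: 2.19.0
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: keda
  namespace: keda
spec:
  interval: 30m
  chartRef:
    kind: OCIRepository
    name: keda
  values:
    prometheus:
      operator:
        enabled: true
        serviceMonitor:
          enabled: true  # self-monitoring via Prometheus, per the configuration above
```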

Scaling Use Cases

| Workload            | Scaler     | Signal                         | Target                          |
| ------------------- | ---------- | ------------------------------ | ------------------------------- |
| Ray Serve inference | Prometheus | Pending request queue depth    | 1-4 replicas                    |
| Envoy Gateway       | Prometheus | Active connections per gateway | KEDA manages envoy proxy fleet  |
| Voice pipeline      | NATS       | Message queue length           | 0-2 replicas                    |
| Batch inference     | Prometheus | Job queue size                 | 0-N GPU pods                    |
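The voice-pipeline use case can be sketched with KEDA's NATS JetStream scaler, which supports scale-to-zero. The stream, consumer, and monitoring-endpoint names are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: voice-pipeline          # hypothetical name
  namespace: voice              # hypothetical namespace
spec:
  scaleTargetRef:
    name: voice-pipeline
  minReplicaCount: 0            # scale to zero when the stream is drained
  maxReplicaCount: 2
  triggers:
    - type: nats-jetstream
      metadata:
        natsServerMonitoringEndpoint: "nats.nats.svc:8222"  # assumed NATS monitoring endpoint
        account: "$G"             # default JetStream account
        stream: "voice"           # illustrative stream name
        consumer: "voice-pipeline"
        lagThreshold: "10"        # scale up when consumer lag exceeds 10 messages
```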