docs: add ADRs 0043-0053 covering remaining architecture gaps
All checks were successful
Update README with ADR Index / update-readme (push) Successful in 6s
New ADRs:
- 0043: Cilium CNI and Network Fabric
- 0044: DNS and External Access Architecture
- 0045: TLS Certificate Strategy (cert-manager)
- 0046: Companions Frontend Architecture
- 0047: MLflow Experiment Tracking and Model Registry
- 0048: Entertainment and Media Stack
- 0049: Self-Hosted Productivity Suite
- 0050: Argo Rollouts Progressive Delivery
- 0051: KEDA Event-Driven Autoscaling
- 0052: Cluster Utilities (Spegel, Descheduler, Reloader, CSI-NFS)
- 0053: Vaultwarden Password Management

README updated with table entries and badge count (53 total).
71
decisions/0050-argo-rollouts-progressive-delivery.md
Normal file
@@ -0,0 +1,71 @@
# Argo Rollouts Progressive Delivery

* Status: accepted
* Date: 2026-02-09
* Deciders: Billy
* Technical Story: Enable progressive delivery (canary, blue-green) for safer deployments alongside existing Argo Workflows

## Context and Problem Statement

Standard Kubernetes Deployments use a rolling update strategy that replaces pods incrementally but gates progress only on readiness probes. Traffic reaches new pods as soon as they report ready, and nothing rolls the change back if error rates rise afterwards. For critical services this creates risk: a bad deployment that passes its readiness checks soon serves all traffic. Progressive delivery allows gradual traffic shifting with automated rollback on failure.

How do we add progressive delivery capabilities without duplicating the existing Argo Workflows infrastructure?

## Decision Drivers

* Reduce blast radius of bad deployments
* Automated rollback on failure metrics
* Complement (not replace) existing GitOps deployment via Flux
* Reuse Argo ecosystem already deployed for workflows
* Dashboard for deployment visibility

## Considered Options

1. **Argo Rollouts**: Progressive delivery controller from the Argo project
2. **Flagger**: Flux-native progressive delivery
3. **Istio traffic management**: Service mesh canary routing
4. **Manual canary via Flux**: Separate canary Deployments managed by Flux

## Decision Outcome

Chosen option: **Argo Rollouts**, because it complements the existing Argo Workflows deployment, provides native canary and blue-green strategies, and includes a dashboard for deployment visibility.

### Positive Consequences

* Canary and blue-green deployment strategies with automated analysis
* Integrates with Envoy Gateway for traffic splitting
* Dashboard for real-time deployment progress
* Same Argo ecosystem as existing Workflows (shared expertise)
* CRD-based, so it works with GitOps (Flux manages Rollout resources)

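
The "automated analysis" mentioned above could be expressed as an `AnalysisTemplate`. The following is a minimal sketch, not the configuration used in this cluster: the template name, the Prometheus address, the `http_requests_total` metric, and the 95% threshold are all assumptions made for illustration.

```yaml
# Hypothetical AnalysisTemplate: name, query, address, and thresholds are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate                  # hypothetical name
spec:
  args:
    - name: service-name              # supplied by the Rollout that references this template
  metrics:
    - name: success-rate
      interval: 1m                    # take one measurement per minute
      count: 5                        # stop after five measurements
      successCondition: result[0] >= 0.95
      failureLimit: 2                 # tolerate up to two failed measurements before the analysis fails
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090   # assumed in-cluster address
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",code!~"5.."}[2m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[2m]))
```

When an analysis run fails, the controller aborts the Rollout and shifts traffic back to the stable version, which is the automated rollback behaviour listed in the decision drivers.
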
### Negative Consequences

* Another CRD set to manage alongside standard Deployments
* Not all workloads need progressive delivery (overhead for simple services)
* Dashboard currently available only via port-forward (no ingress)

## Deployment Configuration

| Setting | Value |
|---|---|
| **Chart** | `argo-rollouts` from Argo HelmRepository |
| **Namespace** | `ci-cd` |
| **Replicas** | 1 |
| **Dashboard** | Enabled |
| **CRDs** | `CreateReplace` on install and upgrade |

Managed by Flux Kustomization with `wait: true` to ensure the controller is ready before dependent Rollout resources are applied.

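
The settings in this section might translate into Flux resources roughly as follows. This is a sketch under assumptions rather than the manifests in the repository: the HelmRepository name `argo`, the `flux-system` namespace and GitRepository, the intervals, and the `path` are hypothetical, and older Flux releases use an earlier `HelmRelease` API version than `v2`.

```yaml
# Sketch only: repository name, intervals, and path are assumptions.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: argo-rollouts
  namespace: ci-cd
spec:
  interval: 30m                       # assumed reconcile interval
  chart:
    spec:
      chart: argo-rollouts
      sourceRef:
        kind: HelmRepository
        name: argo                    # assumed HelmRepository name
        namespace: flux-system        # assumed
  install:
    crds: CreateReplace               # create or replace CRDs on install
  upgrade:
    crds: CreateReplace               # and again on every upgrade
  values:
    controller:
      replicas: 1
    dashboard:
      enabled: true
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: argo-rollouts                 # assumed
  namespace: flux-system              # assumed
spec:
  interval: 10m                       # assumed
  path: ./path/to/argo-rollouts       # hypothetical path in the Git repository
  prune: true
  wait: true                          # block until applied resources report ready
  sourceRef:
    kind: GitRepository
    name: flux-system                 # assumed
```

A Kustomization that ships Rollout resources can then list this one under `dependsOn`, so the CRDs and controller exist before any Rollout is applied.
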
## Use Cases

| Strategy | When to Use | Example |
|----------|-------------|---------|
| Canary | Gradual traffic shift with metric analysis | AI inference endpoint updates |
| Blue-Green | Zero-downtime full cutover with instant rollback | Companions frontend releases |
| Rolling (standard) | Low-risk config changes | Most infrastructure services |

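
To make the first two rows of the table concrete, here is a hedged sketch of what a canary and a blue-green `Rollout` could look like. All workload names, images, ports, replica counts, step weights, and Service names are hypothetical, and the two Services referenced by the blue-green example are assumed to exist already.

```yaml
# Canary sketch: shift traffic in steps, pausing between them (all names hypothetical).
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: inference-api
spec:
  replicas: 4
  selector:
    matchLabels:
      app: inference-api
  template:
    metadata:
      labels:
        app: inference-api
    spec:
      containers:
        - name: inference-api
          image: registry.example.com/inference-api:v2   # placeholder image
          ports:
            - containerPort: 8080
  strategy:
    canary:
      steps:
        - setWeight: 25               # send roughly a quarter of traffic to the new version
        - pause: {duration: 5m}
        - setWeight: 50
        - pause: {duration: 5m}       # full promotion follows the last step
---
# Blue-green sketch: the new version runs behind a preview Service until promoted.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: companions-frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: companions-frontend
  template:
    metadata:
      labels:
        app: companions-frontend
    spec:
      containers:
        - name: frontend
          image: registry.example.com/companions-frontend:v2   # placeholder image
  strategy:
    blueGreen:
      activeService: companions-frontend            # Service receiving production traffic
      previewService: companions-frontend-preview   # Service pointing at the new version
      autoPromotionEnabled: false                   # require a manual promote for cutover
```

Without a traffic router configured, the canary weights are approximated by the ratio of canary to stable pods, so with four replicas a weight of 25 corresponds to a single canary pod.
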
## Links

* Related to [ADR-0009](0009-dual-workflow-engines.md) (Argo ecosystem)
* Related to [ADR-0031](0031-gitea-cicd-strategy.md) (CI/CD pipeline)
* [Argo Rollouts Documentation](https://argoproj.github.io/rollouts/)