docs(adr): finalize ADR-0021 and add ADR-0022

ADR-0021 (Accepted):
- ntfy as central notification hub
- Alertmanager integration for critical/warning alerts
- Service readiness notifications via Flux notification-controller
- Standardized topic naming

ADR-0022 (Proposed):
- ntfy-discord-bridge Python service design
- SSE subscription with reconnection logic
- Message transformation to Discord embeds
- Priority/tag to color/emoji mapping
- Kubernetes deployment with ExternalSecret for webhook
2026-02-02 11:58:52 -05:00
parent 7b77d6c29f
commit e85deaa642
2 changed files with 284 additions and 55 deletions

@@ -2,7 +2,7 @@
## Status
Proposed
Accepted
## Context
@@ -12,12 +12,13 @@ The homelab infrastructure generates notifications from multiple sources:
2. **Alertmanager** - Prometheus alerts for critical/warning conditions
3. **Gatus** - Service health monitoring
4. **Flux** - GitOps reconciliation events
5. **Service readiness** - Notifications when deployments complete successfully
Currently, ntfy serves as the primary notification hub, but there are several issues:
- **Topic inconsistency**: CI workflows were posting to `builds` while documentation (ADR-0015) specified `gitea-ci`
- **No Alertmanager integration**: Critical Prometheus alerts had no delivery mechanism
- **Discord integration desire**: Team wants notifications forwarded to Discord for visibility
- **No service readiness notifications**: No visibility when services come online after deployment
## Decision
@@ -39,6 +40,7 @@ This keeps ntfy auth-protected externally while allowing internal services to pu
| `alertmanager-alerts` | Alertmanager | Prometheus critical/warning alerts |
| `gatus` | Gatus | Service health status changes |
| `flux` | Flux | GitOps reconciliation events |
| `deployments` | Flux/Argo | Service deployment completions |
### 3. Alertmanager Integration
@@ -60,53 +62,43 @@ Routes direct alerts based on severity:
- `severity=critical` → `ntfy-critical` receiver
- `severity=warning` → `ntfy-warning` receiver
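
The severity routing above can be sketched in Python to make the matching logic explicit. This only illustrates the decision, not Alertmanager's configuration format; the `route_alert` helper and its `None` fallback for unmatched severities are assumptions, since the default receiver is not shown in this diff.

```python
from typing import Optional

# Severity label -> receiver name, as listed in the routes above.
RECEIVERS = {
    "critical": "ntfy-critical",
    "warning": "ntfy-warning",
}


def route_alert(labels: dict) -> Optional[str]:
    """Return the receiver for an alert's labels, or None if no route matches.

    The None fallback is an assumption made for this sketch.
    """
    return RECEIVERS.get(labels.get("severity", ""))
```

For example, `route_alert({"severity": "critical"})` selects `ntfy-critical`, while an alert with no `severity` label matches neither route.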
### 4. Discord Integration (Future)
### 4. Service Readiness Notifications
Discord integration will be implemented as a dedicated bridge service that:
To provide visibility when services are fully operational after deployment:
1. **Subscribes** to ntfy topics via SSE/WebSocket
2. **Transforms** ntfy message format to Discord embed format
3. **Forwards** to Discord webhook URL (stored in Vault at `kv/data/discord`)
**Option A: Flux Notification Controller**
Configure Flux's notification-controller to send alerts when Kustomizations/HelmReleases succeed:
#### Design Options
**Option A: Sidecar Container (Simple)**
- Alpine container with curl/jq
- Subscribes to ntfy JSON stream
- Transforms and POSTs to Discord
- Pros: Simple, no custom code
- Cons: Shell script fragility, limited error handling
**Option B: Dedicated Python Service (Recommended)**
- Small Python service using `httpx` or `aiohttp`
- Proper reconnection logic and error handling
- Configurable topic-to-channel mapping
- Health endpoint for monitoring
- Pros: Robust, testable, maintainable
- Cons: Requires building/publishing container image
**Option C: ntfy Actions (Limited)**
- Configure ntfy server with `upstream-base-url` or actions
- Pros: Built into ntfy
- Cons: ntfy doesn't natively support Discord webhook format
#### Recommended Architecture (Option B)
```
┌─────────────────┐ ┌──────────────────┐ ┌─────────────┐
│ CI/Alertmanager │────▶│ ntfy │────▶│ ntfy App │
│ Gatus/Flux │ │ (notification │ │ (mobile) │
└─────────────────┘ │ hub) │ └─────────────┘
└────────┬─────────┘
│ SSE subscribe
┌──────────────────┐ ┌─────────────┐
│ ntfy-discord- │────▶│ Discord │
│ bridge │ │ (webhook) │
└──────────────────┘ └─────────────┘
```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: ntfy-deployments
spec:
  type: generic-hmac # or generic; generic-hmac also needs a secretRef holding the HMAC token
  address: http://ntfy-svc.observability.svc.cluster.local/deployments
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: deployment-success
spec:
  providerRef:
    name: ntfy-deployments
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: '*'
    - kind: HelmRelease
      name: '*'
  inclusionList:
    - ".*succeeded.*"
```
The bridge service would be a new repo (`ntfy-discord-bridge`) following the same patterns as other Python services in the homelab.
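
The transform step of such a bridge could look roughly like the sketch below. It is not the design from ADR-0022: the `to_discord_payload` helper, the color values, and the footer text are all assumptions made for illustration. It relies only on the `topic`, `title`, `message`, and `priority` fields of an ntfy JSON message and on the `embeds` array of a Discord webhook payload.

```python
# Hypothetical priority -> embed color mapping (ntfy priorities run 1-5).
PRIORITY_COLORS = {
    1: 0x95A5A6,  # min: grey
    2: 0x3498DB,  # low: blue
    3: 0x2ECC71,  # default: green
    4: 0xE67E22,  # high: orange
    5: 0xE74C3C,  # max: red
}


def to_discord_payload(ntfy_msg: dict) -> dict:
    """Convert one ntfy JSON message into a Discord webhook payload."""
    # Treat a missing priority as the default (3).
    priority = ntfy_msg.get("priority", 3)
    topic = ntfy_msg.get("topic", "")
    embed = {
        # Fall back to the topic name when the message has no title.
        "title": ntfy_msg.get("title") or topic,
        "description": ntfy_msg.get("message", ""),
        "color": PRIORITY_COLORS.get(priority, PRIORITY_COLORS[3]),
        "footer": {"text": f"ntfy topic: {topic}"},
    }
    return {"embeds": [embed]}
```

Keeping the transformation a pure function separates it from the SSE subscription and the HTTP POST, so the mapping can be unit-tested without network access.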
**Option B: Argo Workflows Post-Deploy Hook**
For Argo-managed deployments, add a notification step at workflow completion.
**Recommendation**: Use Flux Notification Controller (Option A) as it's already part of the GitOps stack and provides native integration.
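
To check which event messages the `inclusionList` pattern would let through, the filter can be approximated in Python. Flux itself evaluates these patterns with Go's regexp package, so this is only an approximation that holds for a simple pattern like this one; the `is_included` helper is hypothetical, and the sample messages in the note below are illustrative rather than captured Flux output.

```python
import re

# Pattern copied from the Alert's inclusionList.
INCLUSION_PATTERNS = [re.compile(r".*succeeded.*")]


def is_included(message: str) -> bool:
    """Return True if any inclusion pattern matches the event message."""
    return any(p.search(message) for p in INCLUSION_PATTERNS)
```

With this pattern, a message such as "Helm upgrade succeeded" passes the filter, while one that never contains the word "succeeded" is dropped. It is worth confirming that the success events of every listed source kind actually use that wording.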
## Consequences
@@ -114,30 +106,26 @@ The bridge service would be a new repo (`ntfy-discord-bridge`) following the sam
- **Single source of truth**: All notifications flow through ntfy
- **Auth protection maintained**: External ntfy access requires Authentik auth
- **Flexible routing**: Can subscribe to specific topics per destination
- **Separation of concerns**: Discord bridge is independent, can be disabled without affecting ntfy
- **Deployment visibility**: Know when services are ready without watching logs
- **Consistent topic naming**: All sources follow documented conventions
### Negative
- **Additional service**: Discord bridge adds operational overhead
- **Latency**: Two-hop delivery (source → ntfy → Discord) adds minimal latency
- **Configuration overhead**: Each notification source requires explicit configuration
### Neutral
- Topic naming must be documented and followed consistently
- Discord webhook URL must be maintained in Vault
- Future Discord integration addressed in ADR-0022
## Implementation Checklist
- [x] Standardize CI notifications to `gitea-ci` topic
- [x] Configure Alertmanager → ntfy for critical/warning alerts
- [ ] Create `ntfy-discord-bridge` repository
- [ ] Implement bridge service with proper error handling
- [ ] Add ExternalSecret for Discord webhook from Vault
- [ ] Deploy bridge to observability namespace
- [ ] Document topic-to-Discord-channel mapping
- [ ] Configure Flux notification-controller for deployment notifications
- [ ] Add `deployments` topic subscription to ntfy app
## Related
- ADR-0015: CI Notifications and Semantic Versioning
- ADR-0020: Internal Registry for CI/CD
- ADR-0022: ntfy-Discord Bridge Service