# ADR-0022: ntfy-Discord Bridge Service

## Status

Accepted

## Context

Per ADR-0021, ntfy serves as the central notification hub for the homelab. Discord is also used for team collaboration and visibility, so notifications must be forwarded there as well.

ntfy does not natively support the Discord webhook format. Discord expects a specific JSON structure with embeds, while ntfy uses its own message format. A bridge service is needed to:

1. Subscribe to ntfy topics
2. Transform messages to Discord embed format
3. Forward to Discord webhooks

## Decision

### Architecture

A dedicated Go microservice (`ntfy-discord`) will bridge ntfy to Discord:

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│ CI/Alertmanager │────▶│       ntfy       │────▶│  ntfy App   │
│   Gatus/Flux    │     │  (notification   │     │  (mobile)   │
└─────────────────┘     │       hub)       │     └─────────────┘
                        └────────┬─────────┘
                                 │
                                 │ SSE/JSON stream
                                 ▼
                        ┌──────────────────┐     ┌─────────────┐
                        │   ntfy-discord   │────▶│   Discord   │
                        │       (Go)       │     │   Webhook   │
                        └──────────────────┘     └─────────────┘
```

### Service Design

**Repository**: `ntfy-discord`

**Technology Stack**:
- Go 1.22+
- `fsnotify` for hot reload of secrets/config
- Standard library `net/http` for SSE subscription
- `log/slog` for structured logging
- Scratch/distroless base image (~10MB final image)

**Why Go over Python**:
- **Smaller images**: ~10MB vs ~150MB+ for Python
- **Cloud native**: Single static binary, no runtime dependencies
- **Memory efficient**: Lower RSS, ideal for an always-on bridge
- **Concurrency**: Goroutines for SSE handling and webhook delivery
- **Compile-time safety**: Type errors are caught at build time, before deployment

**Core Features**:
1. **SSE Subscription**: Connect to ntfy's JSON stream endpoint for real-time messages
2. **Automatic Reconnection**: Exponential backoff on connection failures
3. **Message Transformation**: Convert ntfy format to Discord embed format
4. **Priority Mapping**: Map ntfy priorities to Discord embed colors
5. **Topic Routing**: Configure which topics go to which Discord channels/webhooks
6. **Hot Reload**: Watch mounted secrets/configmaps with fsnotify, reload without restart
7. **Health Endpoint**: `/health` and `/ready` for Kubernetes probes
8. **Metrics**: Prometheus metrics at `/metrics`

### Hot Reload Implementation

Kubernetes mounts Secret keys as symlinks and updates them atomically by swapping the `..data` symlink in the mount directory. The bridge uses `fsnotify` to watch for these changes:

```go
// watchSecrets reloads config when the mounted secret directory changes.
// Kubernetes swaps the "..data" symlink on update, which surfaces as a
// Create event, so secretPath should be the mount directory.
func (b *Bridge) watchSecrets(ctx context.Context, secretPath string) {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		slog.Error("create secret watcher", "error", err)
		return
	}
	defer watcher.Close()
	if err := watcher.Add(secretPath); err != nil {
		slog.Error("watch secret path", "path", secretPath, "error", err)
		return
	}
	for {
		select {
		case event, ok := <-watcher.Events:
			if !ok {
				return // watcher closed
			}
			if event.Has(fsnotify.Write) || event.Has(fsnotify.Create) {
				slog.Info("secret changed, reloading config")
				b.reloadConfig(secretPath)
			}
		case err, ok := <-watcher.Errors:
			if !ok {
				return
			}
			slog.Error("secret watcher error", "error", err)
		case <-ctx.Done():
			return
		}
	}
}
```

This allows ExternalSecrets to rotate the Discord webhook URL without pod restarts.

### Configuration

Configuration is supplied via environment variables and mounted secrets:

```yaml
# Environment variables (ConfigMap)
NTFY_URL: "http://ntfy.observability.svc.cluster.local"
NTFY_TOPICS: "gitea-ci,alertmanager-alerts,flux-deployments,gatus"
LOG_LEVEL: "info"
METRICS_ENABLED: "true"

# Mounted secret files (hot-reloadable):
#   /secrets/discord-webhook-url   # Single webhook for all topics
# OR for topic routing:
#   /secrets/topic-webhooks.yaml   # YAML mapping topics to webhooks
```

Topic routing file (optional):

```yaml
gitea-ci: "https://discord.com/api/webhooks/xxx/ci"
alertmanager-alerts: "https://discord.com/api/webhooks/xxx/alerts"
flux-deployments: "https://discord.com/api/webhooks/xxx/deploys"
default: "https://discord.com/api/webhooks/xxx/general"
```

### Message Transformation

ntfy message:

```json
{
  "id": "abc123",
  "topic": "gitea-ci",
  "title": "Build succeeded",
  "message": "ray-serve-apps published to PyPI",
  "priority": 3,
  "tags": ["package", "white_check_mark"],
  "time": 1770050091
}
```

Discord embed:

```json
{
  "embeds": [{
    "title": "✅ Build succeeded",
    "description": "ray-serve-apps published to PyPI",
    "color": 3066993,
    "fields": [
      {"name": "Topic", "value": "gitea-ci", "inline": true}
    ],
    "timestamp": "2026-02-02T16:34:51Z",
    "footer": {"text": "ntfy"}
  }]
}
```

**Priority → Color Mapping**:

| Priority | Name | Discord Color |
|----------|------|---------------|
| 5 | Max/Urgent | 🔴 Red (15158332) |
| 4 | High | 🟠 Orange (15105570) |
| 3 | Default | 🟢 Green (3066993) |
| 2 | Low | ⚪ Gray (9807270) |
| 1 | Min | ⚪ Light Gray (12370112) |

**Tag → Emoji Mapping**:

Common ntfy tags are converted to Discord-friendly emojis in the title:
- `white_check_mark` / `heavy_check_mark` → ✅
- `x` / `skull` → ❌
- `warning` → ⚠️
- `rotating_light` → 🚨
- `rocket` → 🚀
- `package` → 📦

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ntfy-discord
  namespace: observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ntfy-discord
  template:
    metadata:
      labels:
        app: ntfy-discord
    spec:
      containers:
        - name: bridge
          image: gitea-http.gitea.svc.cluster.local:3000/daviestechlabs/ntfy-discord:latest
          env:
            - name: NTFY_URL
              value: "http://ntfy.observability.svc.cluster.local"
            - name: NTFY_TOPICS
              value: "gitea-ci,alertmanager-alerts,flux-deployments"
            - name: SECRETS_PATH
              value: "/secrets"
          ports:
            - containerPort: 8080
              name: http
          volumeMounts:
            - name: discord-secrets
              mountPath: /secrets
              readOnly: true
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 5
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            periodSeconds: 10
          resources:
            limits:
              cpu: 50m
              memory: 32Mi
            requests:
              cpu: 5m
              memory: 16Mi
      volumes:
        - name: discord-secrets
          secret:
            secretName: discord-webhook-secret
```

### Secret Management

The Discord webhook URL is stored in Vault at `kv/data/discord` and synced into the cluster by an ExternalSecret. The `secretKey` becomes the file name under `/secrets`, so it matches the path the bridge reads:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: discord-webhook-secret
  namespace: observability
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault
    kind: ClusterSecretStore
  target:
    name: discord-webhook-secret
  data:
    - secretKey: discord-webhook-url
      remoteRef:
        key: kv/data/discord
        property: webhook_url
```

When ExternalSecrets refreshes and updates the secret, the bridge detects the file change and reloads without a restart.

### Error Handling

1. **Connection Loss**: Exponential backoff (1s, 2s, 4s, ... max 60s)
2. **Discord Rate Limits**: Respect the `Retry-After` header, queue messages
3. **Invalid Messages**: Log and skip, don't crash
4. **Webhook Errors**: Log error, continue processing other messages
5. **Config Reload Errors**: Log error, keep using previous config

## Consequences

### Positive

- **Tiny footprint**: ~10MB image, 16Mi memory request
- **Hot reload**: Secrets update without pod restart
- **Robust**: Proper reconnection and error handling
- **Observable**: Structured logging, Prometheus metrics, health endpoints
- **Fast startup**: <100ms cold start
- **Cloud native**: Static binary, distroless image

### Negative

- **Go learning curve**: Different patterns than Python services
- **Operational overhead**: Another service to maintain
- **Latency**: Adds ~50-100ms to notification delivery

### Neutral

- Webhook URL must be maintained in Vault
- Service logs should be monitored for errors

## Implementation Checklist

- [x] Create `ntfy-discord` repository
- [ ] Implement core bridge logic
- [ ] Add SSE client with reconnection
- [ ] Implement message transformation
- [ ] Add fsnotify hot reload for secrets
- [ ] Add health/ready/metrics endpoints
- [ ] Write unit tests
- [ ] Create multi-stage Dockerfile (scratch base)
- [ ] Set up CI/CD pipeline (Gitea Actions)
- [ ] Add ExternalSecret for Discord webhook
- [ ] Create Kubernetes manifests
- [ ] Deploy to observability namespace
- [ ] Verify notifications flowing to Discord

## Related

- ADR-0021: Notification Architecture
- ADR-0015: CI Notifications and Semantic Versioning
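
To make the Message Transformation section concrete, the following Go sketch converts an ntfy message into a Discord webhook payload using the color values from the priority table. It is illustrative only: the type names (`NtfyMessage`, `WebhookPayload`, `Embed`, `EmbedField`, `EmbedFooter`) and the helpers `toDiscord` and `colorFor` are hypothetical and not taken from the `ntfy-discord` repository. Falling back to the default color for a missing or unknown priority is an assumption, and tag-to-emoji prefixing is left out because the rule for combining several tags is not specified in this ADR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// NtfyMessage is the subset of ntfy's JSON message fields shown in this ADR.
type NtfyMessage struct {
	ID       string   `json:"id"`
	Topic    string   `json:"topic"`
	Title    string   `json:"title"`
	Message  string   `json:"message"`
	Priority int      `json:"priority"`
	Tags     []string `json:"tags"`
	Time     int64    `json:"time"` // Unix seconds
}

// Discord payload types (hypothetical names; fields follow the embed example).
type EmbedField struct {
	Name   string `json:"name"`
	Value  string `json:"value"`
	Inline bool   `json:"inline"`
}

type EmbedFooter struct {
	Text string `json:"text"`
}

type Embed struct {
	Title       string       `json:"title"`
	Description string       `json:"description"`
	Color       int          `json:"color"`
	Fields      []EmbedField `json:"fields"`
	Timestamp   string       `json:"timestamp"`
	Footer      EmbedFooter  `json:"footer"`
}

type WebhookPayload struct {
	Embeds []Embed `json:"embeds"`
}

const defaultPriority = 3

// priorityColors mirrors the priority-to-color table in this ADR.
var priorityColors = map[int]int{
	5: 15158332,
	4: 15105570,
	3: 3066993,
	2: 9807270,
	1: 12370112,
}

// colorFor returns the embed color for a priority. Unknown or missing
// priorities fall back to the default (an assumption of this sketch).
func colorFor(priority int) int {
	if c, ok := priorityColors[priority]; ok {
		return c
	}
	return priorityColors[defaultPriority]
}

// toDiscord builds a single-embed payload from one ntfy message.
func toDiscord(m NtfyMessage) WebhookPayload {
	return WebhookPayload{Embeds: []Embed{{
		Title:       m.Title,
		Description: m.Message,
		Color:       colorFor(m.Priority),
		Fields:      []EmbedField{{Name: "Topic", Value: m.Topic, Inline: true}},
		// Discord expects an ISO 8601 timestamp; ntfy sends Unix seconds.
		Timestamp: time.Unix(m.Time, 0).UTC().Format(time.RFC3339),
		Footer:    EmbedFooter{Text: "ntfy"},
	}}}
}

func main() {
	raw := `{"id":"abc123","topic":"gitea-ci","title":"Build succeeded",` +
		`"message":"ray-serve-apps published to PyPI","priority":3,` +
		`"tags":["package","white_check_mark"],"time":1770050091}`
	var msg NtfyMessage
	if err := json.Unmarshal([]byte(raw), &msg); err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(toDiscord(msg), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the transformation a pure function of the message, with no network or config access, makes it straightforward to cover with table-driven unit tests before the SSE client and webhook delivery exist.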