# ADR-0022: ntfy-Discord Bridge Service

## Status

Accepted

## Context

Per ADR-0021, ntfy serves as the central notification hub for the homelab. However, Discord is used for team collaboration and visibility, so notifications must be forwarded there as well.

ntfy does not natively support the Discord webhook format. Discord expects a specific JSON structure with embeds, while ntfy uses its own message format. A bridge service is needed to:

1. Subscribe to ntfy topics
2. Transform messages to Discord embed format
3. Forward to Discord webhooks

## Decision

### Architecture

A dedicated Go microservice (`ntfy-discord`) will bridge ntfy to Discord:

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────┐
│ CI/Alertmanager │────▶│       ntfy       │────▶│  ntfy App   │
│ Gatus/Flux      │     │  (notification   │     │  (mobile)   │
└─────────────────┘     │       hub)       │     └─────────────┘
                        └────────┬─────────┘
                                 │ SSE/JSON stream
                                 ▼
                        ┌──────────────────┐     ┌─────────────┐
                        │   ntfy-discord   │────▶│   Discord   │
                        │       (Go)       │     │   Webhook   │
                        └──────────────────┘     └─────────────┘
```

### Service Design

**Repository**: `ntfy-discord`

**Technology Stack**:
- Go 1.22+
- `fsnotify` for hot reload of secrets/config
- Standard library `net/http` for SSE subscription
- `slog` for structured logging
- Scratch/distroless base image (~10MB final image)

**Why Go over Python**:
- **Smaller images**: ~10MB vs ~150MB+ for Python
- **Cloud native**: Single static binary, no runtime dependencies
- **Memory efficient**: Lower RSS, ideal for an always-on bridge
- **Concurrency**: Goroutines for SSE handling and webhook delivery
- **Compile-time safety**: Catch errors before deployment

**Core Features**:

1. **SSE Subscription**: Connect to ntfy's JSON stream endpoint for real-time messages
2. **Automatic Reconnection**: Exponential backoff on connection failures
3. **Message Transformation**: Convert ntfy format to Discord embed format
4. **Priority Mapping**: Map ntfy priorities to Discord embed colors
5. **Topic Routing**: Configure which topics go to which Discord channels/webhooks
6. **Hot Reload**: Watch mounted secrets/configmaps with fsnotify, reload without restart
7. **Health Endpoint**: `/health` and `/ready` for Kubernetes probes
8. **Metrics**: Prometheus metrics at `/metrics`

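The parsing half of the subscription feature could look roughly like the sketch below. It assumes ntfy's JSON stream delivers one JSON object per line, each carrying an `event` field (`open`, `keepalive`, or `message`). The names `Message`, `parseLine`, and `readStream` are hypothetical, not taken from the `ntfy-discord` repository.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Message holds the ntfy fields the bridge cares about (hypothetical type).
type Message struct {
	ID       string   `json:"id"`
	Event    string   `json:"event"`
	Topic    string   `json:"topic"`
	Title    string   `json:"title"`
	Message  string   `json:"message"`
	Priority int      `json:"priority"`
	Tags     []string `json:"tags"`
	Time     int64    `json:"time"`
}

// parseLine decodes one line of the stream. It reports false for malformed
// JSON and for non-message events, so the caller can skip the line.
func parseLine(line []byte) (Message, bool) {
	var m Message
	if err := json.Unmarshal(line, &m); err != nil {
		return Message{}, false
	}
	return m, m.Event == "message"
}

// readStream passes every message event read from r to handle.
// It returns when the stream ends or a read error occurs.
func readStream(r io.Reader, handle func(Message)) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		if m, ok := parseLine(scanner.Bytes()); ok {
			handle(m)
		}
	}
	return scanner.Err()
}

func main() {
	// Stand-in for an HTTP response body from the JSON stream endpoint.
	stream := `{"id":"a1","event":"open","topic":"gitea-ci","time":1770050090}
{"id":"abc123","event":"message","topic":"gitea-ci","title":"Build succeeded","priority":3,"time":1770050091}`
	_ = readStream(strings.NewReader(stream), func(m Message) {
		fmt.Println(m.Topic, m.Title)
	})
}
```

In the service, `r` would presumably be the body of a long-lived `GET` request against the ntfy JSON stream, and a return from `readStream` would trigger the reconnection logic.
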
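### Hot Reload Implementation

Kubernetes mounts secret keys as symlinked files and updates them atomically by swapping the `..data` symlink in the mount directory. The bridge uses `fsnotify` to watch that directory for changes:

```go
// watchSecrets reloads the config whenever the mounted secret changes.
// secretPath is the mount directory (e.g. /secrets): an atomic symlink swap
// surfaces as Create events on the directory, not as a Write on the file.
// Assumes imports: context, log/slog, github.com/fsnotify/fsnotify.
func (b *Bridge) watchSecrets(ctx context.Context, secretPath string) {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		slog.Error("create watcher", "error", err)
		return
	}
	defer watcher.Close()

	if err := watcher.Add(secretPath); err != nil {
		slog.Error("watch secret path", "path", secretPath, "error", err)
		return
	}

	for {
		select {
		case event, ok := <-watcher.Events:
			if !ok {
				return // watcher closed
			}
			if event.Has(fsnotify.Write) || event.Has(fsnotify.Create) {
				slog.Info("secret changed, reloading config")
				b.reloadConfig(secretPath)
			}
		case err, ok := <-watcher.Errors:
			if !ok {
				return
			}
			slog.Error("watcher error", "error", err)
		case <-ctx.Done():
			return
		}
	}
}
```

This allows ExternalSecrets to rotate the Discord webhook URL without pod restarts.
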
### Configuration

Configuration via environment variables and mounted secrets:

```yaml
# Environment variables (ConfigMap)
NTFY_URL: "http://ntfy.observability.svc.cluster.local"
NTFY_TOPICS: "gitea-ci,alertmanager-alerts,flux-deployments,gatus"
LOG_LEVEL: "info"
METRICS_ENABLED: "true"

# Mounted secret (hot-reloadable)
/secrets/discord-webhook-url # Single webhook for all topics
# OR for topic routing:
/secrets/topic-webhooks.yaml # YAML mapping topics to webhooks
```

Topic routing file (optional):
```yaml
gitea-ci: "https://discord.com/api/webhooks/xxx/ci"
alertmanager-alerts: "https://discord.com/api/webhooks/xxx/alerts"
flux-deployments: "https://discord.com/api/webhooks/xxx/deploys"
default: "https://discord.com/api/webhooks/xxx/general"
```

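Once the routing file is parsed into a map, the lookup with a `default` fallback might be sketched as follows. The `Routes` type and `webhookFor` method are hypothetical names, and treating `default` as a reserved key is an assumption drawn from the routing file shown above.

```go
package main

import "fmt"

// Routes maps an ntfy topic to a Discord webhook URL (hypothetical type).
// The "default" key is assumed to be the fallback entry.
type Routes map[string]string

// webhookFor returns the webhook for topic, falling back to "default".
// ok is false when neither entry exists, so the caller can log and skip.
func (r Routes) webhookFor(topic string) (url string, ok bool) {
	if url, ok = r[topic]; ok {
		return url, true
	}
	url, ok = r["default"]
	return url, ok
}

func main() {
	routes := Routes{
		"gitea-ci": "https://discord.com/api/webhooks/xxx/ci",
		"default":  "https://discord.com/api/webhooks/xxx/general",
	}
	// "gatus" has no dedicated entry, so the default webhook is used.
	url, ok := routes.webhookFor("gatus")
	fmt.Println(url, ok)
}
```

One consequence of this design is that an ntfy topic literally named `default` cannot be routed separately from the fallback.
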
### Message Transformation

ntfy message:
```json
{
  "id": "abc123",
  "topic": "gitea-ci",
  "title": "Build succeeded",
  "message": "ray-serve-apps published to PyPI",
  "priority": 3,
  "tags": ["package", "white_check_mark"],
  "time": 1770050091
}
```

Discord embed:
```json
{
  "embeds": [{
    "title": "✅ Build succeeded",
    "description": "ray-serve-apps published to PyPI",
    "color": 3066993,
    "fields": [
      {"name": "Topic", "value": "gitea-ci", "inline": true}
    ],
    "timestamp": "2026-02-02T16:34:51Z",
    "footer": {"text": "ntfy"}
  }]
}
```

**Priority → Color Mapping**:

| Priority | Name | Discord Color |
|----------|------|---------------|
| 5 | Max/Urgent | 🔴 Red (15158332) |
| 4 | High | 🟠 Orange (15105570) |
| 3 | Default | 🟢 Green (3066993) |
| 2 | Low | ⚪ Gray (9807270) |
| 1 | Min | ⚪ Light Gray (12370112) |

**Tag → Emoji Mapping**:

Common ntfy tags are converted to Discord-friendly emojis in the title:
- `white_check_mark` / `heavy_check_mark` → ✅
- `x` / `skull` → ❌
- `warning` → ⚠️
- `rotating_light` → 🚨
- `rocket` → 🚀
- `package` → 📦

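The mapping rules above could be combined into a single transformation function along these lines. This is a sketch under assumptions: the type and function names are hypothetical, and the emoji precedence is a guess, since the ADR does not say which emoji wins when several tags match. Here alert and status tags are assumed to take precedence over informational ones, which makes `["package", "white_check_mark"]` yield ✅ as in the example embed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Message holds the ntfy fields used by the transformation (hypothetical).
type Message struct {
	Topic, Title, Message string
	Priority              int
	Tags                  []string
	Time                  int64
}

type Field struct {
	Name   string `json:"name"`
	Value  string `json:"value"`
	Inline bool   `json:"inline"`
}

type Footer struct {
	Text string `json:"text"`
}

type Embed struct {
	Title       string  `json:"title"`
	Description string  `json:"description"`
	Color       int     `json:"color"`
	Fields      []Field `json:"fields"`
	Timestamp   string  `json:"timestamp"`
	Footer      Footer  `json:"footer"`
}

type Payload struct {
	Embeds []Embed `json:"embeds"`
}

// priorityColor copies the decimal colors from the priority table.
var priorityColor = map[int]int{5: 15158332, 4: 15105570, 3: 3066993, 2: 9807270, 1: 12370112}

// tagEmoji is ordered by assumed precedence: the first entry that matches
// any tag on the message wins.
var tagEmoji = []struct{ tag, emoji string }{
	{"rotating_light", "🚨"}, {"x", "❌"}, {"skull", "❌"}, {"warning", "⚠️"},
	{"white_check_mark", "✅"}, {"heavy_check_mark", "✅"},
	{"rocket", "🚀"}, {"package", "📦"},
}

// emojiFor returns the highest-precedence emoji for tags, or "" if none match.
func emojiFor(tags []string) string {
	for _, e := range tagEmoji {
		for _, t := range tags {
			if t == e.tag {
				return e.emoji
			}
		}
	}
	return ""
}

// toDiscord converts an ntfy message into a Discord webhook payload.
// Priorities outside 1..5 (including an unset 0) use the default color.
func toDiscord(m Message) Payload {
	color, ok := priorityColor[m.Priority]
	if !ok {
		color = priorityColor[3]
	}
	title := m.Title
	if e := emojiFor(m.Tags); e != "" {
		title = e + " " + title
	}
	return Payload{Embeds: []Embed{{
		Title:       title,
		Description: m.Message,
		Color:       color,
		Fields:      []Field{{Name: "Topic", Value: m.Topic, Inline: true}},
		Timestamp:   time.Unix(m.Time, 0).UTC().Format(time.RFC3339),
		Footer:      Footer{Text: "ntfy"},
	}}}
}

func main() {
	out, _ := json.MarshalIndent(toDiscord(Message{
		Topic: "gitea-ci", Title: "Build succeeded",
		Message: "ray-serve-apps published to PyPI", Priority: 3,
		Tags: []string{"package", "white_check_mark"}, Time: 1770050091,
	}), "", "  ")
	fmt.Println(string(out))
}
```

The timestamp is rendered in UTC from ntfy's Unix `time` field, so the embed does not depend on the pod's local time zone.
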
### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ntfy-discord
  namespace: observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ntfy-discord
  template:
    metadata:
      labels:
        app: ntfy-discord
    spec:
      containers:
        - name: bridge
          image: gitea-http.gitea.svc.cluster.local:3000/daviestechlabs/ntfy-discord:latest
          env:
            - name: NTFY_URL
              value: "http://ntfy.observability.svc.cluster.local"
            - name: NTFY_TOPICS
              value: "gitea-ci,alertmanager-alerts,flux-deployments"
            - name: SECRETS_PATH
              value: "/secrets"
          ports:
            - containerPort: 8080
              name: http
          volumeMounts:
            - name: discord-secrets
              mountPath: /secrets
              readOnly: true
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 5
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            periodSeconds: 10
          resources:
            limits:
              cpu: 50m
              memory: 32Mi
            requests:
              cpu: 5m
              memory: 16Mi
      volumes:
        - name: discord-secrets
          secret:
            secretName: discord-webhook-secret
```

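The two probe paths in the manifest could be served by handlers like the ones sketched below. The ADR does not define what "ready" means, so this sketch assumes the bridge is ready only while it holds an open ntfy subscription. The `probes` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// probes tracks whether the bridge currently holds an ntfy subscription.
// The SSE loop is assumed to call connected.Store on connect and disconnect.
type probes struct {
	connected atomic.Bool
}

// readyStatus returns 200 while subscribed and 503 otherwise.
func (p *probes) readyStatus() int {
	if p.connected.Load() {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// mux registers /health (always 200 while the process is serving) and /ready.
func (p *probes) mux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(p.readyStatus())
	})
	return mux
}

func main() {
	p := &probes{}
	fmt.Println(p.readyStatus()) // not subscribed yet
	p.connected.Store(true)
	fmt.Println(p.readyStatus()) // subscribed
	_ = p.mux()                  // the service would pass this to http.ListenAndServe(":8080", ...)
}
```

Keeping liveness independent of the ntfy connection avoids restart loops when ntfy itself is down; only readiness reflects the subscription state.
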
### Secret Management

The Discord webhook URL is stored in Vault at `kv/data/discord`. Each `secretKey` becomes a file name under the mount, so `discord-webhook-url` is exposed as `/secrets/discord-webhook-url`:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: discord-webhook-secret
  namespace: observability
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault
    kind: ClusterSecretStore
  target:
    name: discord-webhook-secret
  data:
    - secretKey: discord-webhook-url
      remoteRef:
        key: kv/data/discord
        property: webhook_url
```

When ExternalSecrets refreshes and updates the secret, the bridge detects the file change and reloads without restart.

### Error Handling

1. **Connection Loss**: Exponential backoff (1s, 2s, 4s, ... max 60s)
2. **Discord Rate Limits**: Respect `Retry-After` header, queue messages
3. **Invalid Messages**: Log and skip, don't crash
4. **Webhook Errors**: Log error, continue processing other messages
5. **Config Reload Errors**: Log error, keep using previous config

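The first two rules reduce to small delay calculations, sketched below. The function names are hypothetical, and the sketch assumes the `Retry-After` value is a number of seconds, falling back to a caller-supplied delay when the header is missing or unparsable.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// backoff returns the reconnect delay for a zero-based attempt number:
// 1s, 2s, 4s, ... capped at 60s.
func backoff(attempt int) time.Duration {
	const maxDelay = 60 * time.Second
	if attempt < 0 {
		attempt = 0
	}
	if attempt >= 6 { // 2^6 s = 64s already exceeds the cap; also avoids overflow
		return maxDelay
	}
	return time.Second << attempt
}

// retryAfter parses a Retry-After header value given in seconds and returns
// fallback when the value is empty, malformed, or negative.
func retryAfter(header string, fallback time.Duration) time.Duration {
	secs, err := strconv.ParseFloat(header, 64)
	if err != nil || secs < 0 {
		return fallback
	}
	return time.Duration(secs * float64(time.Second))
}

func main() {
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Println(attempt, backoff(attempt))
	}
	fmt.Println(retryAfter("2", time.Second))
	fmt.Println(retryAfter("", time.Second))
}
```

A reconnect loop would reset the attempt counter to zero after a successful subscription, so a brief outage does not leave the bridge waiting a full minute on the next failure.
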
## Consequences

### Positive

- **Tiny footprint**: ~10MB image, 16Mi memory request
- **Hot reload**: Secrets update without pod restart
- **Robust**: Proper reconnection and error handling
- **Observable**: Structured logging, Prometheus metrics, health endpoints
- **Fast startup**: <100ms cold start
- **Cloud native**: Static binary, distroless image

### Negative

- **Go learning curve**: Different patterns than Python services
- **Operational overhead**: Another service to maintain
- **Latency**: Adds ~50-100ms to notification delivery

### Neutral

- Webhook URL must be maintained in Vault
- Service logs should be monitored for errors

## Implementation Checklist

- [x] Create `ntfy-discord` repository
- [ ] Implement core bridge logic
- [ ] Add SSE client with reconnection
- [ ] Implement message transformation
- [ ] Add fsnotify hot reload for secrets
- [ ] Add health/ready/metrics endpoints
- [ ] Write unit tests
- [ ] Create multi-stage Dockerfile (scratch base)
- [ ] Set up CI/CD pipeline (Gitea Actions)
- [ ] Add ExternalSecret for Discord webhook
- [ ] Create Kubernetes manifests
- [ ] Deploy to observability namespace
- [ ] Verify notifications flowing to Discord

## Related

- ADR-0021: Notification Architecture
- ADR-0015: CI Notifications and Semantic Versioning