homelab-design/decisions/0021-notification-architecture.md

Notification Architecture

  • Status: accepted
  • Date: 2026-02-04
  • Deciders: Billy
  • Technical Story: Unify notification delivery across CI, alerting, and monitoring systems

Context

The homelab infrastructure generates notifications from multiple sources:

  1. CI/CD pipelines (Gitea Actions) - build success/failure
  2. Alertmanager - Prometheus alerts for critical/warning conditions
  3. Gatus - Service health monitoring
  4. Flux - GitOps reconciliation events
  5. Service readiness - Notifications when deployments complete successfully

Currently, ntfy serves as the primary notification hub, but there are several issues:

  • Topic inconsistency: CI workflows were posting to builds while documentation (ADR-0015) specified gitea-ci
  • No Alertmanager integration: Critical Prometheus alerts had no delivery mechanism
  • No service readiness notifications: No visibility when services come online after deployment

Decision

1. ntfy as the Notification Hub

ntfy will serve as the central notification aggregation point. All internal services publish to ntfy topics via the internal Kubernetes service URL:

http://ntfy-svc.observability.svc.cluster.local/<topic>

External access to ntfy stays behind authentication, while in-cluster services can publish without credentials.
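As an illustration of an internal publisher, a Gitea Actions job could post its result to the gitea-ci topic with a plain HTTP request. This is a minimal sketch, not the repository's actual workflow: the workflow name, job name, build command, and message text are hypothetical, and it assumes the runner can resolve cluster-internal DNS names. Title and Tags are standard ntfy publish headers.

```yaml
# Hypothetical workflow; names and the build command are placeholders.
name: build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build  # placeholder build command
      - name: Notify ntfy
        if: always()  # report failures as well as successes
        run: |
          curl -fsS \
            -H "Title: ${{ github.repository }}: build ${{ job.status }}" \
            -H "Tags: package" \
            -d "Commit ${{ github.sha }} on ${{ github.ref_name }}" \
            http://ntfy-svc.observability.svc.cluster.local/gitea-ci
```

Because the step runs with `if: always()`, the topic receives both successful and failed builds, and the job status appears in the notification title.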

2. Standardized Topics

| Topic | Source | Description |
|-------|--------|-------------|
| gitea-ci | Gitea Actions | CI/CD build notifications |
| alertmanager-alerts | Alertmanager | Prometheus critical/warning alerts |
| gatus | Gatus | Service health status changes |
| flux | Flux | GitOps reconciliation events |
| deployments | Flux/Argo | Service deployment completions |
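For sources with native ntfy support, the standardized topic is set in that tool's own configuration. The fragment below sketches how Gatus might be pointed at the gatus topic through its ntfy alerting provider. The endpoint name, health URL, interval, and thresholds are hypothetical and not taken from the homelab configuration.

```yaml
alerting:
  ntfy:
    url: "http://ntfy-svc.observability.svc.cluster.local"
    topic: "gatus"
    priority: 3              # assumed priority
    default-alert:
      failure-threshold: 3   # assumed: alert after three consecutive failures
      send-on-resolved: true

endpoints:
  - name: example-service    # hypothetical endpoint
    url: "http://example.default.svc.cluster.local/healthz"
    interval: 1m
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: ntfy           # inherits the default-alert settings above
```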

3. Alertmanager Integration

Alertmanager is configured to forward alerts to ntfy using the built-in tpl=alertmanager template:

receivers:
  # Critical alerts: urgent priority, rotating_light tag
  - name: ntfy-critical
    webhookConfigs:
      - url: "http://ntfy-svc.observability.svc.cluster.local/alertmanager-alerts?tpl=alertmanager&priority=urgent&tags=rotating_light"
        sendResolved: true  # also notify when the alert clears
  # Warning alerts: high priority, warning tag
  - name: ntfy-warning
    webhookConfigs:
      - url: "http://ntfy-svc.observability.svc.cluster.local/alertmanager-alerts?tpl=alertmanager&priority=high&tags=warning"
        sendResolved: true  # also notify when the alert clears

Routes direct alerts based on severity:

  • severity=critical → ntfy-critical receiver
  • severity=warning → ntfy-warning receiver
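The severity routing could be expressed roughly as follows. This is a sketch that assumes the prometheus-operator AlertmanagerConfig schema, which the camelCase receiver fields suggest. The fallback receiver and the grouping key are assumptions, not part of the recorded decision.

```yaml
route:
  receiver: ntfy-warning       # assumed fallback for alerts matching no child route
  groupBy: ["alertname"]       # assumed grouping key
  routes:
    - receiver: ntfy-critical
      matchers:
        - name: severity
          matchType: "="
          value: critical
    - receiver: ntfy-warning
      matchers:
        - name: severity
          matchType: "="
          value: warning
```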

4. Service Readiness Notifications

To provide visibility when services are fully operational after deployment:

Option A: Flux Notification Controller. Configure Flux's notification-controller to send alerts when Kustomizations or HelmReleases reconcile successfully:

apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: ntfy-deployments
spec:
  type: generic  # generic-hmac would additionally require a secretRef holding the HMAC token
  address: http://ntfy-svc.observability.svc.cluster.local/deployments
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: deployment-success
spec:
  providerRef:
    name: ntfy-deployments
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: '*'
    - kind: HelmRelease
      name: '*'
  inclusionList:
    - ".*succeeded.*"

Option B: Argo Workflows Post-Deploy Hook. For Argo-managed deployments, add a notification step at workflow completion.
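One way to realize such a step is an Argo Workflows exit handler, which runs after the main templates finish regardless of outcome. The sketch below is illustrative only: the workflow name, the placeholder deploy step, and the container images are assumptions. `onExit`, `{{workflow.name}}`, and `{{workflow.status}}` are standard Argo Workflows features.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: deploy-example-   # hypothetical workflow name
spec:
  entrypoint: deploy
  onExit: notify                  # exit handler runs on success and failure
  templates:
    - name: deploy
      container:
        image: alpine:3.20        # assumed image
        command: [sh, -c]
        args: ["echo deploying"]  # placeholder for the real deployment step
    - name: notify
      container:
        image: curlimages/curl:8.8.0  # assumed image and tag
        command: [sh, -c]
        args:
          - >-
            curl -fsS
            -H "Title: {{workflow.name}} {{workflow.status}}"
            -d "Workflow {{workflow.name}} finished with status {{workflow.status}}"
            http://ntfy-svc.observability.svc.cluster.local/deployments
```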

Recommendation: Use Flux Notification Controller (Option A) as it's already part of the GitOps stack and provides native integration.

Consequences

Positive

  • Single source of truth: All notifications flow through ntfy
  • Auth protection maintained: External ntfy access requires Authentik auth
  • Deployment visibility: Know when services are ready without watching logs
  • Consistent topic naming: All sources follow documented conventions

Negative

  • Configuration overhead: Each notification source requires explicit configuration

Neutral

  • Topic naming must be documented and followed consistently
  • Future Discord integration is addressed in ADR-0022

Implementation Checklist

  • Standardize CI notifications to gitea-ci topic
  • Configure Alertmanager → ntfy for critical/warning alerts
  • Configure Flux notification-controller for deployment notifications
  • Add deployments topic subscription to ntfy app

Related ADRs

  • ADR-0015: CI Notifications and Semantic Versioning
  • ADR-0022: ntfy-Discord Bridge Service