Files
homelab-design/TECH-STACK.md
Billy D. 832cda34bd feat: add comprehensive architecture documentation
- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories
and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00

8.1 KiB

🛠️ Technology Stack

Complete inventory of technologies used in the DaviesTechLabs homelab

Platform Layer

Operating System

Component Version Purpose
Talos Linux v1.12.1 Immutable, API-driven Kubernetes OS
Kernel 6.18.2-talos Linux kernel with GPU drivers

Container Orchestration

Component Version Purpose
Kubernetes v1.35.0 Container orchestration
containerd 2.1.6 Container runtime
Cilium Latest CNI, network policies, eBPF

GitOps

Component Version Purpose
Flux CD v2 GitOps continuous delivery
SOPS Latest Secret encryption
Age Latest Encryption key management

AI/ML Layer

Inference Engines

Service Framework GPU Model Type
vLLM ROCm AMD Strix Halo Large Language Models
faster-whisper CUDA NVIDIA RTX 2070 Speech-to-Text
XTTS CUDA NVIDIA RTX 2070 Text-to-Speech
BGE Embeddings ROCm AMD Radeon 680M Text Embeddings
BGE Reranker Intel Intel Arc Document Reranking

ML Serving

Component Version Purpose
KServe v0.12+ Model serving framework
Ray Serve 2.53.0 Unified inference endpoints

ML Workflows

Component Version Purpose
Kubeflow Pipelines 2.15.0 ML pipeline orchestration
Argo Workflows v3.7.8 DAG-based workflows
Argo Events Latest Event-driven triggers
MLflow 3.7.0 Experiment tracking, model registry

GPU Scheduling

Component Version Purpose
Volcano Latest GPU-aware scheduling
AMD GPU Device Plugin v1.4.1 ROCm GPU allocation
NVIDIA Device Plugin Latest CUDA GPU allocation
Node Feature Discovery v0.18.2 Hardware detection

Data Layer

Databases

Component Version Purpose
CloudNative-PG 16.11 PostgreSQL for metadata
Milvus Latest Vector database for RAG
ClickHouse Latest Analytics, access logs
Valkey Latest Redis-compatible cache

Object Storage

Component Version Purpose
MinIO Latest S3-compatible storage
Longhorn v1.10.1 Distributed block storage
NFS CSI Driver Latest Shared filesystem

Messaging

Component Version Purpose
NATS Latest Message bus
NATS JetStream Built-in Persistent streaming

Data Processing

Component Version Purpose
Apache Spark Latest Batch analytics
Apache Flink Latest Stream processing
Apache Iceberg Latest Table format
Nessie Latest Data catalog
Trino 479 SQL query engine

Application Layer

Web Frameworks

Application Language Framework Purpose
Companions Go net/http + HTMX AI chat interface
Voice WebApp Python Gradio Voice assistant UI
Various handlers Python asyncio + nats.py NATS event handlers

Frontend

Technology Purpose
HTMX Dynamic HTML updates
Alpine.js Lightweight reactivity
VRM 3D avatar rendering

Networking Layer

Ingress

Component Version Purpose
Envoy Gateway v1.6.3 Gateway API implementation
cloudflared Latest Cloudflare tunnel

DNS & Certificates

Component Version Purpose
external-dns Latest Automatic DNS management
cert-manager Latest TLS certificate automation

Service Mesh

Component Purpose
Spegel P2P container image distribution

Security Layer

Identity & Access

Component Version Purpose
Authentik 2025.12.1 Identity provider, SSO
Vault 1.21.2 Secret management
External Secrets Operator v1.3.1 Kubernetes secrets sync

Runtime Security

Component Version Purpose
Falco 0.42.1 Runtime threat detection
Cilium Network Policies Built-in Network segmentation

Backup

Component Version Purpose
Velero v1.17.1 Cluster backup/restore

Observability Layer

Metrics

Component Purpose
Prometheus Metrics collection
Grafana Dashboards & visualization

Logging

Component Version Purpose
Grafana Alloy v1.12.0 Log collection
Loki Latest Log aggregation

Tracing

Component Purpose
OpenTelemetry Collector Trace collection
Tempo/Jaeger Trace storage & query

Development Tools

Local Development

Tool Purpose
mise Tool version management
Task Task runner (Taskfile.yaml)
flux-local Local Flux testing

CI/CD

Tool Purpose
GitHub Actions CI/CD pipelines
Renovate Dependency updates

Image Building

Tool Purpose
Docker Container builds
GHCR Container registry

Media & Entertainment

Component Version Purpose
Jellyfin 10.11.5 Media server
Nextcloud 32.0.5 File sync & share
Prowlarr, Bazarr, etc. Various *arr stack
Kasm 1.18.1 Browser isolation

Python Dependencies (llm-workflows)

# Core
nats-py>=2.7.0          # NATS client
msgpack>=1.0.0          # Binary serialization
aiohttp>=3.9.0          # HTTP client

# ML/AI
pymilvus>=2.4.0         # Milvus client
sentence-transformers   # Embeddings
openai>=1.0.0           # vLLM OpenAI API

# Kubeflow
kfp>=2.12.1             # Pipeline SDK

Version Pinning Strategy

Component Type Strategy
Base images Pin major.minor
Helm charts Pin exact version
Python packages Pin minimum version
System extensions Pin via Talos schematic