# 🛠️ Technology Stack

> **Complete inventory of technologies used in the DaviesTechLabs homelab**

## Platform Layer

### Operating System

| Component | Version | Purpose |
|-----------|---------|---------|
| [Talos Linux](https://talos.dev) | v1.12.1 | Immutable, API-driven Kubernetes OS |
| Kernel | 6.18.2-talos | Linux kernel with GPU drivers |

### Container Orchestration

| Component | Version | Purpose |
|-----------|---------|---------|
| [Kubernetes](https://kubernetes.io) | v1.35.0 | Container orchestration |
| [containerd](https://containerd.io) | 2.1.6 | Container runtime |
| [Cilium](https://cilium.io) | Latest | CNI, network policies, eBPF |

### GitOps

| Component | Version | Purpose |
|-----------|---------|---------|
| [Flux CD](https://fluxcd.io) | v2 | GitOps continuous delivery |
| [SOPS](https://github.com/getsops/sops) | Latest | Secret encryption |
| [Age](https://github.com/FiloSottile/age) | Latest | Encryption key management |

---

## AI/ML Layer

### GPU Inference (KubeRay RayService)

All AI inference runs on a unified Ray Serve endpoint with fractional GPU allocation:

| Service | Model | GPU Node | GPU Type | Allocation |
|---------|-------|----------|----------|------------|
| `/llm` | [vLLM](https://vllm.ai) (Llama 3.1 70B) | khelben | AMD Strix Halo 64GB | 0.95 GPU |
| `/whisper` | [faster-whisper](https://github.com/guillaumekln/faster-whisper) v3 | elminster | NVIDIA RTX 2070 8GB | 0.5 GPU |
| `/tts` | [XTTS](https://github.com/coqui-ai/TTS) | elminster | NVIDIA RTX 2070 8GB | 0.5 GPU |
| `/embeddings` | [BGE-Large](https://huggingface.co/BAAI/bge-large-en-v1.5) | drizzt | AMD Radeon 680M 12GB | 0.8 GPU |
| `/reranker` | [BGE-Reranker](https://huggingface.co/BAAI/bge-reranker-large) | danilo | Intel Arc 16GB | 0.8 GPU |

**Endpoint**: `ai-inference-serve-svc.ai-ml.svc.cluster.local:8000/{service}`

### ML Serving Stack

| Component | Version | Purpose |
|-----------|---------|---------|
| [KubeRay](https://ray-project.github.io/kuberay/) | 1.4+ | Ray cluster operator |
| [Ray Serve](https://ray.io/serve) | 2.53.0 | Unified inference endpoints |
| [KServe](https://kserve.github.io) | v0.12+ | Abstraction layer (ExternalName aliases) |

### ML Workflows

| Component | Version | Purpose |
|-----------|---------|---------|
| [Kubeflow Pipelines](https://kubeflow.org) | 2.15.0 | ML pipeline orchestration |
| [Argo Workflows](https://argoproj.github.io/workflows) | v3.7.8 | DAG-based workflows |
| [Argo Events](https://argoproj.github.io/events) | Latest | Event-driven triggers |
| [MLflow](https://mlflow.org) | 3.7.0 | Experiment tracking, model registry |

### GPU Scheduling

| Component | Version | Purpose |
|-----------|---------|---------|
| [Volcano](https://volcano.sh) | Latest | GPU-aware scheduling |
| AMD GPU Device Plugin | v1.4.1 | ROCm GPU allocation |
| NVIDIA Device Plugin | Latest | CUDA GPU allocation |
| [Node Feature Discovery](https://github.com/kubernetes-sigs/node-feature-discovery) | v0.18.2 | Hardware detection |

---

## Data Layer

### Databases

| Component | Version | Purpose |
|-----------|---------|---------|
| [CloudNative-PG](https://cloudnative-pg.io) | 16.11 | PostgreSQL for metadata |
| [Milvus](https://milvus.io) | Latest | Vector database for RAG |
| [ClickHouse](https://clickhouse.com) | Latest | Analytics, access logs |
| [Valkey](https://valkey.io) | Latest | Redis-compatible cache |

### Object Storage

| Component | Version | Purpose |
|-----------|---------|---------|
| [MinIO](https://min.io) | Latest | S3-compatible storage |
| [Longhorn](https://longhorn.io) | v1.10.1 | Distributed block storage |
| NFS CSI Driver | Latest | Shared filesystem |

### Messaging

| Component | Version | Purpose |
|-----------|---------|---------|
| [NATS](https://nats.io) | Latest | Message bus |
| NATS JetStream | Built-in | Persistent streaming |

### Data Processing

| Component | Version | Purpose |
|-----------|---------|---------|
| [Apache Spark](https://spark.apache.org) | Latest | Batch analytics |
| [Apache Flink](https://flink.apache.org) | Latest | Stream processing |
| [Apache Iceberg](https://iceberg.apache.org) | Latest | Table format |
| [Nessie](https://projectnessie.org) | Latest | Data catalog |
| [Trino](https://trino.io) | 479 | SQL query engine |

---

## Application Layer

### Web Frameworks

| Application | Language | Framework | Purpose |
|-------------|----------|-----------|---------|
| Companions | Go | net/http + HTMX | AI chat interface |
| Voice WebApp | Python | Gradio | Voice assistant UI |
| Various handlers | Python | asyncio + nats.py | NATS event handlers |

### Frontend

| Technology | Purpose |
|------------|---------|
| [HTMX](https://htmx.org) | Dynamic HTML updates |
| [Alpine.js](https://alpinejs.dev) | Lightweight reactivity |
| [VRM](https://vrm.dev) | 3D avatar rendering |

---

## Networking Layer

### Ingress

| Component | Version | Purpose |
|-----------|---------|---------|
| [Envoy Gateway](https://gateway.envoyproxy.io) | v1.6.3 | Gateway API implementation |
| [cloudflared](https://developers.cloudflare.com/cloudflare-one/connections/connect-apps) | Latest | Cloudflare tunnel |

### DNS & Certificates

| Component | Version | Purpose |
|-----------|---------|---------|
| [external-dns](https://github.com/kubernetes-sigs/external-dns) | Latest | Automatic DNS management |
| [cert-manager](https://cert-manager.io) | Latest | TLS certificate automation |

### Service Mesh

| Component | Purpose |
|-----------|---------|
| [Spegel](https://github.com/spegel-org/spegel) | P2P container image distribution |

---

## Security Layer

### Identity & Access

| Component | Version | Purpose |
|-----------|---------|---------|
| [Authentik](https://goauthentik.io) | 2025.12.1 | Identity provider, SSO |
| [Vault](https://vaultproject.io) | 1.21.2 | Secret management |
| [External Secrets Operator](https://external-secrets.io) | v1.3.1 | Kubernetes secrets sync |

### Runtime Security

| Component | Version | Purpose |
|-----------|---------|---------|
| [Falco](https://falco.org) | 0.42.1 | Runtime threat detection |
| Cilium Network Policies | Built-in | Network segmentation |

### Backup

| Component | Version | Purpose |
|-----------|---------|---------|
| [Velero](https://velero.io) | v1.17.1 | Cluster backup/restore |

---

## Observability Layer

### Metrics

| Component | Purpose |
|-----------|---------|
| [Prometheus](https://prometheus.io) | Metrics collection |
| [Grafana](https://grafana.com) | Dashboards & visualization |

### Logging

| Component | Version | Purpose |
|-----------|---------|---------|
| [Grafana Alloy](https://grafana.com/oss/alloy) | v1.12.0 | Log collection |
| [Loki](https://grafana.com/oss/loki) | Latest | Log aggregation |

### Tracing

| Component | Purpose |
|-----------|---------|
| [OpenTelemetry Collector](https://opentelemetry.io) | Trace collection |
| Tempo/Jaeger | Trace storage & query |

---

## Development Tools

### Local Development

| Tool | Purpose |
|------|---------|
| [mise](https://mise.jdx.dev) | Tool version management |
| [Task](https://taskfile.dev) | Task runner (Taskfile.yaml) |
| [flux-local](https://github.com/allenporter/flux-local) | Local Flux testing |

### CI/CD

| Tool | Purpose |
|------|---------|
| GitHub Actions | CI/CD pipelines |
| [Renovate](https://renovatebot.com) | Dependency updates |

### Image Building

| Tool | Purpose |
|------|---------|
| Docker | Container builds |
| GHCR | Container registry |

---

## Media & Entertainment

| Component | Version | Purpose |
|-----------|---------|---------|
| [Jellyfin](https://jellyfin.org) | 10.11.5 | Media server |
| [Nextcloud](https://nextcloud.com) | 32.0.5 | File sync & share |
| Prowlarr, Bazarr, etc. | Various | *arr stack |
| [Kasm](https://kasmweb.com) | 1.18.1 | Browser isolation |

---

## Python Dependencies (handler-base)

Core library for all NATS handlers: [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base)

```toml
# Core
nats-py>=2.7.0              # NATS client
msgpack>=1.0.0              # Binary serialization
httpx>=0.27.0               # HTTP client

# ML/AI
pymilvus>=2.4.0             # Milvus client
openai>=1.0.0               # vLLM OpenAI API

# Observability
opentelemetry-api>=1.20.0
opentelemetry-sdk>=1.20.0
mlflow>=2.10.0              # Experiment tracking

# Kubeflow (kubeflow repo)
kfp>=2.12.1                 # Pipeline SDK
```

---

## Version Pinning Strategy

| Component Type | Strategy |
|----------------|----------|
| Base images | Pin major.minor |
| Helm charts | Pin exact version |
| Python packages | Pin minimum version |
| System extensions | Pin via Talos schematic |

## Related Documents

- [ARCHITECTURE.md](ARCHITECTURE.md) - How components connect
- [decisions/](decisions/) - Why we chose specific technologies
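
## Example: Building Inference Endpoint URLs

The GPU Inference section lists five routes behind one Ray Serve endpoint of the form `ai-inference-serve-svc.ai-ml.svc.cluster.local:8000/{service}`. The sketch below shows one way a handler might turn a service name into a full URL. Only the host, port, and route names come from that section; the `http://` scheme, the `INFERENCE_BASE_URL` and `SERVICES` names, and the `endpoint_url` helper are hypothetical and are not part of handler-base.

```python
# Hypothetical helper: maps a route name from the GPU Inference table
# to a full URL on the unified Ray Serve endpoint.

# Assumption: plain HTTP inside the cluster; the source gives only host:port.
INFERENCE_BASE_URL = "http://ai-inference-serve-svc.ai-ml.svc.cluster.local:8000"

# Route names as listed in the GPU Inference table.
SERVICES = frozenset({"llm", "whisper", "tts", "embeddings", "reranker"})


def endpoint_url(service: str) -> str:
    """Return the URL for a service, accepting 'llm' or '/llm'."""
    name = service.strip("/")
    if name not in SERVICES:
        raise ValueError(f"unknown inference service: {service!r}")
    return f"{INFERENCE_BASE_URL}/{name}"


if __name__ == "__main__":
    for svc in sorted(SERVICES):
        print(endpoint_url(svc))
```

Validating the name against a fixed set keeps a typo from silently producing a request to a route that does not exist. The request paths and payloads each route expects are not described here, so the sketch stops at URL construction.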