homelab-design/README.md
Billy D. 832cda34bd feat: add comprehensive architecture documentation
- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories
and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00

🏠 DaviesTechLabs Homelab Architecture

Production-grade AI/ML platform running on bare-metal Kubernetes

📖 Quick Navigation

| Document | Purpose |
|----------|---------|
| AGENT-ONBOARDING.md | Start here if you're an AI agent |
| ARCHITECTURE.md | High-level system overview |
| TECH-STACK.md | Complete technology stack |
| DOMAIN-MODEL.md | Core entities and bounded contexts |
| GLOSSARY.md | Terminology reference |
| decisions/ | Architecture Decision Records (ADRs) |

🎯 What This Is

Architecture documentation for the DaviesTechLabs homelab Kubernetes cluster. The cluster features:

  • AI/ML Platform: KServe inference services, RAG pipelines, voice assistants
  • Multi-GPU Support: AMD ROCm (RDNA3/Strix Halo), NVIDIA CUDA, Intel Arc
  • GitOps: Flux CD with SOPS encryption
  • Event-Driven: NATS JetStream for real-time messaging
  • ML Workflows: Kubeflow Pipelines + Argo Workflows

🖥️ Cluster Overview

| Node | Role | Hardware | GPU |
|------|------|----------|-----|
| storm | Control Plane | Intel 13th Gen | Integrated |
| bruenor | Control Plane | Intel 13th Gen | Integrated |
| catti | Control Plane | Intel 13th Gen | Integrated |
| elminster | Worker | NVIDIA RTX 2070 | 8GB CUDA |
| khelben | Worker (vLLM) | AMD Strix Halo | 64GB Unified |
| drizzt | Worker | AMD Radeon 680M | 12GB RDNA2 |
| danilo | Worker | Intel Core Ultra 9 | Intel Arc |

🚀 Quick Start

View Current Cluster State

# Get node status
kubectl get nodes -o wide

# View AI/ML workloads
kubectl get pods -n ai-ml

# Check KServe inference services
kubectl get inferenceservices -n ai-ml
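
The listing above shows every inference service, ready or not. To surface only the ones that need attention, a small filter can help. This is a minimal sketch, assuming each InferenceService reports a `Ready` condition in its status and that `ai-ml` is the namespace of interest, as in the commands above. The function name `list_unready_isvc` is hypothetical and not part of this repository.

```shell
# Print the names of InferenceServices whose Ready condition is not "True".
# list_unready_isvc is a hypothetical helper; the namespace defaults to ai-ml.
list_unready_isvc() {
  ns="${1:-ai-ml}"
  # Emit one "<name><TAB><Ready status>" line per service, then keep the
  # names whose status is anything other than "True" (including empty).
  kubectl get inferenceservices -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' |
    awk -F '\t' '$1 != "" && $2 != "True" { print $1 }'
}
```

Calling `list_unready_isvc` with no arguments checks `ai-ml`; pass another namespace as the first argument to check elsewhere. A service with no conditions yet is reported as not ready.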

Key Endpoints

| Service | URL | Purpose |
|---------|-----|---------|
| Kubeflow | kubeflow.lab.daviestechlabs.io | ML Pipeline UI |
| Companions | companions-chat.lab.daviestechlabs.io | AI Chat Interface |
| Voice | voice.lab.daviestechlabs.io | Voice Assistant |
| Gitea | git.daviestechlabs.io | Self-hosted Git |
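
A quick way to confirm the endpoints listed here respond is to request each one and print the HTTP status code. The sketch below assumes the services are served over HTTPS and are reachable from the machine running it; neither is stated in this README. The function name `check_endpoints` is hypothetical.

```shell
# Print "<host> <http_code>" for each host given as an argument.
# check_endpoints is a hypothetical helper; HTTPS is an assumption.
check_endpoints() {
  for host in "$@"; do
    # curl writes 000 as the status code when the request fails or times out.
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "https://$host")" || true
    echo "$host $code"
  done
}

# Usage, with the hosts from the Key Endpoints list:
# check_endpoints kubeflow.lab.daviestechlabs.io \
#   companions-chat.lab.daviestechlabs.io \
#   voice.lab.daviestechlabs.io \
#   git.daviestechlabs.io
```

A non-2xx code does not always mean the service is down: a login redirect or an authentication challenge can be a healthy response.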

📂 Repository Structure

homelab-design/
├── README.md                          # This file
├── AGENT-ONBOARDING.md                # AI agent quick-start
├── ARCHITECTURE.md                    # High-level system overview
├── CONTEXT-DIAGRAM.mmd                # C4 Level 1 (Mermaid)
├── CONTAINER-DIAGRAM.mmd              # C4 Level 2
├── TECH-STACK.md                      # Complete tech stack
├── DOMAIN-MODEL.md                    # Core entities
├── CODING-CONVENTIONS.md              # Patterns & practices
├── GLOSSARY.md                        # Terminology
├── decisions/                         # ADRs
│   ├── 0000-template.md
│   ├── 0001-record-architecture-decisions.md
│   ├── 0002-use-talos-linux.md
│   └── ...
├── specs/                             # Feature specifications
└── diagrams/                          # Additional diagrams

Related Repositories

| Repository | Purpose |
|------------|---------|
| homelab-k8s2 | Kubernetes manifests, Flux GitOps |
| llm-workflows | NATS handlers, Argo/KFP workflows |
| companions-frontend | Go web server, HTMX frontend |

📝 Contributing

  1. For architecture changes, create an ADR in decisions/
  2. Update relevant documentation
  3. Submit a PR with context
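
Step 1 above can be sketched as a small helper that copies `decisions/0000-template.md` to the next free number. This is an illustration rather than a script shipped with the repository: the function name `new_adr` is hypothetical, and the `NNNN-kebab-case-title.md` naming is inferred from the file names shown in the repository structure.

```shell
# Create the next numbered ADR in decisions/ from 0000-template.md.
# new_adr is a hypothetical helper; the NNNN-title.md naming is an assumption
# inferred from the existing file names.
new_adr() {
  title="$1"
  dir="${2:-decisions}"
  # Highest existing four-digit prefix among the ADR files.
  last="$(ls "$dir" | sed -n 's/^\([0-9][0-9][0-9][0-9]\)-.*\.md$/\1/p' | sort -n | tail -n 1)"
  # Strip leading zeros so shell arithmetic does not read the number as octal.
  last="$(echo "${last:-0}" | sed 's/^0*//')"
  next="$(printf '%04d' $(( ${last:-0} + 1 )))"
  # Lowercase the title and collapse runs of other characters into hyphens.
  slug="$(printf '%s' "$title" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9][^a-z0-9]*/-/g; s/^-//; s/-$//')"
  # Copy the template and print the path of the new file.
  cp "$dir/0000-template.md" "$dir/$next-$slug.md" && echo "$dir/$next-$slug.md"
}
```

Run from the repository root, `new_adr "Some Decision Title"` prints the path of the new file, which can then be edited and included in the PR.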

Last updated: 2026-02-01