From b6f7605fab19d46f1e829b010fd811564fe64dd3 Mon Sep 17 00:00:00 2001
From: "Billy D."
Date: Mon, 2 Feb 2026 05:58:35 -0500
Subject: [PATCH] docs: Update for decomposed repo structure

- AGENT-ONBOARDING: New repo map with daviestechlabs Gitea repos
- TECH-STACK: Reference handler-base instead of llm-workflows
- CODING-CONVENTIONS: Update project structure for new repos
- ADR 0006: Update GitRepository examples for Gitea repos

llm-workflows has been split into:
- handler-base, chat-handler, voice-assistant
- kuberay-images, argo, kubeflow, mlflow, gradio-ui
---
 AGENT-ONBOARDING.md                | 71 +++++++++++++++++++++++-------
 CODING-CONVENTIONS.md              | 34 +++++++++-----
 TECH-STACK.md                      | 14 ++++--
 decisions/0006-gitops-with-flux.md | 13 +++---
 4 files changed, 95 insertions(+), 37 deletions(-)

diff --git a/AGENT-ONBOARDING.md b/AGENT-ONBOARDING.md
index ee43710..32d0f82 100644
--- a/AGENT-ONBOARDING.md
+++ b/AGENT-ONBOARDING.md
@@ -15,9 +15,21 @@ You are working on a **homelab Kubernetes cluster** running:
 | Repo | What It Contains | When to Edit |
 |------|------------------|--------------|
 | `homelab-k8s2` | Kubernetes manifests, Talos config, Flux | Infrastructure changes |
-| `llm-workflows` | NATS handlers, Argo/KFP workflows | Workflow/handler changes |
-| `companions-frontend` | Go server, HTMX UI, VRM avatars | Frontend changes |
 | `homelab-design` (this) | Architecture docs, ADRs | Design decisions |
+| `companions-frontend` | Go server, HTMX UI, VRM avatars | Frontend changes |
+
+### AI/ML Repos (git.daviestechlabs.io/daviestechlabs)
+
+| Repo | Purpose |
+|------|---------|
+| `handler-base` | Shared Python library for NATS handlers |
+| `chat-handler` | Text chat with RAG pipeline |
+| `voice-assistant` | Voice pipeline (STT → RAG → LLM → TTS) |
+| `kuberay-images` | GPU-specific Ray worker Docker images |
+| `argo` | Argo Workflows (training, batch inference) |
+| `kubeflow` | Kubeflow Pipeline definitions |
+| `mlflow` | MLflow integration utilities |
+| `gradio-ui` | Gradio demo apps (embeddings, STT, TTS) |
 
 ## 🏗️ System Architecture (30-Second Version)
 
@@ -74,22 +86,39 @@ talos/
 │   └── nvidia/nvidia-runtime.yaml
 ```
 
-### Workflows (`llm-workflows`)
+### AI/ML Services (Gitea daviestechlabs org)
 
 ```
-workflows/                    # NATS handler deployments
-├── chat-handler.yaml
-├── voice-assistant.yaml
-└── pipeline-bridge.yaml
+handler-base/                 # Shared handler library
+├── handler_base/             # Core classes
+│   ├── handler.py            # Base Handler class
+│   ├── nats_client.py        # NATS wrapper
+│   └── clients/              # Service clients (STT, TTS, LLM, etc.)
+
+chat-handler/                 # RAG chat service
+├── chat_handler_v2.py        # Handler-base version
+└── Dockerfile.v2
+
+voice-assistant/              # Voice pipeline service
+├── voice_assistant_v2.py     # Handler-base version
+└── pipelines/voice_pipeline.py
 
 argo/                         # Argo WorkflowTemplates
-├── document-ingestion.yaml
 ├── batch-inference.yaml
-└── qlora-training.yaml
+├── qlora-training.yaml
+└── document-ingestion.yaml
 
-pipelines/                    # Kubeflow Pipeline Python
+kubeflow/                     # Kubeflow Pipeline definitions
 ├── voice_pipeline.py
-└── document_ingestion_pipeline.py
+├── document_ingestion_pipeline.py
+└── evaluation_pipeline.py
+
+kuberay-images/               # GPU worker images
+├── dockerfiles/
+│   ├── Dockerfile.ray-worker-nvidia
+│   ├── Dockerfile.ray-worker-strixhalo
+│   └── Dockerfile.ray-worker-rdna2
+└── ray-serve/                # Serve modules
 ```
 
 ## 🔌 Service Endpoints (Internal)
@@ -138,14 +167,24 @@ f"ai.pipeline.status.{request_id}"  # Status updates
 
 ### Deploy a New AI Service
 
 1. Create InferenceService in `homelab-k8s2/kubernetes/apps/ai-ml/kserve/`
-2. Add endpoint to `llm-workflows/config/ai-services-config.yaml`
+2. Push to main → Flux deploys automatically
+
+### Add a New NATS Handler
+
+1. Create handler repo or add to existing (use `handler-base` library)
+2. Add K8s Deployment in `homelab-k8s2/kubernetes/apps/ai-ml/`
 3. Push to main → Flux deploys automatically
 
-### Add a New Workflow
+### Add a New Argo Workflow
 
-1. Create handler in `llm-workflows/chat-handler/` or `llm-workflows/voice-assistant/`
-2. Add Kubernetes Deployment in `llm-workflows/workflows/`
-3. Push to main → Flux deploys automatically
+1. Add WorkflowTemplate to `argo/` repo
+2. Push to main → Gitea syncs to cluster
+
+### Add a New Kubeflow Pipeline
+
+1. Add pipeline .py to `kubeflow/` repo
+2. Compile with `python pipeline.py`
+3. Upload YAML to Kubeflow UI
 
 ### Create Architecture Decision
diff --git a/CODING-CONVENTIONS.md b/CODING-CONVENTIONS.md
index 9929c93..30aac87 100644
--- a/CODING-CONVENTIONS.md
+++ b/CODING-CONVENTIONS.md
@@ -25,24 +25,34 @@ kubernetes/
 - Apps: lowercase with hyphens (`chat-handler`, `voice-assistant`)
 - Secrets: `{app}-{type}` (e.g., `milvus-credentials`)
 
-### llm-workflows (Orchestration)
+### AI/ML Repos (git.daviestechlabs.io/daviestechlabs)
 
 ```
-workflows/                    # Kubernetes Deployments for NATS handlers
-├── {handler}.yaml            # One file per handler
+handler-base/                 # Shared library for all handlers
+├── handler_base/
+│   ├── handler.py            # Base Handler class
+│   ├── nats_client.py        # NATS wrapper
+│   ├── config.py             # Pydantic Settings
+│   ├── health.py             # K8s probes
+│   ├── telemetry.py          # OpenTelemetry
+│   └── clients/              # Service clients
+└── pyproject.toml
+
+chat-handler/                 # Text chat service
+voice-assistant/              # Voice pipeline service
+├── {name}.py                 # Standalone version
+├── {name}_v2.py              # Handler-base version (preferred)
+└── Dockerfile.v2
 
 argo/                         # Argo WorkflowTemplates
-├── {workflow-name}.yaml      # One file per workflow
+├── {workflow-name}.yaml
 
-pipelines/                    # Kubeflow Pipeline Python files
-├── {pipeline}_pipeline.py    # Pipeline definition
-└── kfp-sync-job.yaml         # Upload job
+kubeflow/                     # Kubeflow Pipelines
+├── {pipeline}_pipeline.py
 
-{handler}/                    # Python source code
-├── __init__.py
-├── {handler}.py              # Main entry point
-├── requirements.txt
-└── Dockerfile
+kuberay-images/               # GPU worker images
+├── dockerfiles/
+└── ray-serve/
 ```
 
 ---
diff --git a/TECH-STACK.md b/TECH-STACK.md
index bbb1de8..03e5fce 100644
--- a/TECH-STACK.md
+++ b/TECH-STACK.md
@@ -237,20 +237,26 @@
 
 ---
 
-## Python Dependencies (llm-workflows)
+## Python Dependencies (handler-base)
+
+Core library for all NATS handlers: [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base)
 
 ```toml
 # Core
 nats-py>=2.7.0           # NATS client
 msgpack>=1.0.0           # Binary serialization
-aiohttp>=3.9.0           # HTTP client
+httpx>=0.27.0            # HTTP client
 
 # ML/AI
 pymilvus>=2.4.0          # Milvus client
-sentence-transformers    # Embeddings
 openai>=1.0.0            # vLLM OpenAI API
 
-# Kubeflow
+# Observability
+opentelemetry-api>=1.20.0
+opentelemetry-sdk>=1.20.0
+mlflow>=2.10.0           # Experiment tracking
+
+# Kubeflow (kubeflow repo)
 kfp>=2.12.1              # Pipeline SDK
 ```
 
diff --git a/decisions/0006-gitops-with-flux.md b/decisions/0006-gitops-with-flux.md
index 4c55e39..839bebd 100644
--- a/decisions/0006-gitops-with-flux.md
+++ b/decisions/0006-gitops-with-flux.md
@@ -65,20 +65,23 @@ homelab-k8s2/
 ### Multi-Repository Sync
 
 ```yaml
-# GitRepository for llm-workflows
+# GitRepository for Gitea repos (daviestechlabs org)
+# Examples: argo, kubeflow, chat-handler, voice-assistant
 apiVersion: source.toolkit.fluxcd.io/v1
 kind: GitRepository
 metadata:
-  name: llm-workflows
+  name: argo-workflows
   namespace: flux-system
 spec:
-  url: ssh://git@github.com/Billy-Davies-2/llm-workflows
+  url: https://git.daviestechlabs.io/daviestechlabs/argo.git
   ref:
     branch: main
-  secretRef:
-    name: github-deploy-key
+  # Public repos don't need secretRef
 ```
+
+Note: The monolithic `llm-workflows` repo has been decomposed into separate repos
+in the daviestechlabs Gitea organization. See AGENT-ONBOARDING.md for the full list.
+
 ### SOPS Integration
 
 ```yaml
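
A GitRepository source like the one in ADR 0006 only tells Flux where to fetch; applying the fetched manifests also takes a companion Flux Kustomization pointing at that source. A minimal sketch, assuming the same `argo-workflows` name as above (the `interval` and `path` values are illustrative, not taken from these repos):

```yaml
# Hypothetical companion to the argo-workflows GitRepository;
# interval and path are assumptions, not from the actual repo.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: argo-workflows
  namespace: flux-system
spec:
  interval: 10m            # how often to reconcile against the source
  sourceRef:
    kind: GitRepository
    name: argo-workflows   # must match the GitRepository metadata.name
  path: "./"               # directory of manifests within the repo
  prune: true              # remove cluster objects deleted from Git
```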