docs: Update for decomposed repo structure

- AGENT-ONBOARDING: New repo map with daviestechlabs Gitea repos
- TECH-STACK: Reference handler-base instead of llm-workflows
- CODING-CONVENTIONS: Update project structure for new repos
- ADR 0006: Update GitRepository examples for Gitea repos

llm-workflows has been split into:
- handler-base, chat-handler, voice-assistant
- kuberay-images, argo, kubeflow, mlflow, gradio-ui
commit b6f7605fab (parent 832cda34bd)
2026-02-02 05:58:35 -05:00

4 changed files with 95 additions and 37 deletions


@@ -15,9 +15,21 @@ You are working on a **homelab Kubernetes cluster** running:
| Repo | What It Contains | When to Edit |
|------|------------------|--------------|
| `homelab-k8s2` | Kubernetes manifests, Talos config, Flux | Infrastructure changes |
| `homelab-design` (this) | Architecture docs, ADRs | Design decisions |
| `companions-frontend` | Go server, HTMX UI, VRM avatars | Frontend changes |
### AI/ML Repos (git.daviestechlabs.io/daviestechlabs)
| Repo | Purpose |
|------|---------|
| `handler-base` | Shared Python library for NATS handlers |
| `chat-handler` | Text chat with RAG pipeline |
| `voice-assistant` | Voice pipeline (STT → RAG → LLM → TTS) |
| `kuberay-images` | GPU-specific Ray worker Docker images |
| `argo` | Argo Workflows (training, batch inference) |
| `kubeflow` | Kubeflow Pipeline definitions |
| `mlflow` | MLflow integration utilities |
| `gradio-ui` | Gradio demo apps (embeddings, STT, TTS) |
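The voice-assistant flow in the table above (STT → RAG → LLM → TTS) can be sketched as a chain of stages. Every function body here is a stub standing in for the real service clients in handler-base's `clients/` package, not actual code from the repos:

```python
# Stub stages for the voice pipeline; each would call a real service
# (STT server, Milvus, vLLM, TTS server) in the actual voice-assistant repo.
def stt(audio: bytes) -> str:                 # speech-to-text
    return "what is flux"

def rag(query: str) -> list[str]:             # retrieve context from the vector store
    return ["Flux syncs Git to the cluster."]

def llm(query: str, ctx: list[str]) -> str:   # generate an answer from query + context
    return f"{query}? {ctx[0]}"

def tts(text: str) -> bytes:                  # text-to-speech
    return text.encode()

def voice_pipeline(audio: bytes) -> bytes:
    query = stt(audio)
    return tts(llm(query, rag(query)))
```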
## 🏗️ System Architecture (30-Second Version)
@@ -74,22 +86,39 @@ talos/
│   └── nvidia/nvidia-runtime.yaml
```
### AI/ML Services (Gitea daviestechlabs org)
```
handler-base/                      # Shared handler library
├── handler_base/                  # Core classes
│   ├── handler.py                 # Base Handler class
│   ├── nats_client.py             # NATS wrapper
│   └── clients/                   # Service clients (STT, TTS, LLM, etc.)
chat-handler/                      # RAG chat service
├── chat_handler_v2.py             # Handler-base version
└── Dockerfile.v2
voice-assistant/                   # Voice pipeline service
├── voice_assistant_v2.py          # Handler-base version
└── pipelines/voice_pipeline.py
argo/                              # Argo WorkflowTemplates
├── batch-inference.yaml
├── qlora-training.yaml
└── document-ingestion.yaml
kubeflow/                          # Kubeflow Pipeline definitions
├── voice_pipeline.py
├── document_ingestion_pipeline.py
└── evaluation_pipeline.py
kuberay-images/                    # GPU worker images
├── dockerfiles/
│   ├── Dockerfile.ray-worker-nvidia
│   ├── Dockerfile.ray-worker-strixhalo
│   └── Dockerfile.ray-worker-rdna2
└── ray-serve/                     # Serve modules
```
## 🔌 Service Endpoints (Internal)
@@ -138,14 +167,24 @@ f"ai.pipeline.status.{request_id}" # Status updates
### Deploy a New AI Service
1. Create InferenceService in `homelab-k8s2/kubernetes/apps/ai-ml/kserve/`
2. Push to main → Flux deploys automatically
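For step 1, a minimal InferenceService might look like the following sketch; the name, namespace, model format, and storageUri are placeholders, not values from `homelab-k8s2`:

```yaml
# Hypothetical example -- adjust name, namespace, and model source to fit
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-llm
  namespace: ai-ml
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      storageUri: pvc://models/example-llm
      resources:
        limits:
          nvidia.com/gpu: "1"
```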
### Add a New NATS Handler
1. Create handler repo or add to existing (use `handler-base` library)
2. Add K8s Deployment in `homelab-k8s2/kubernetes/apps/ai-ml/`
3. Push to main → Flux deploys automatically
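The handler shape that `handler-base` encourages is request in, reply out over NATS. The class and function names below are assumptions for illustration (the real base class lives in the handler-base repo), and JSON stands in for the msgpack serialization the stack actually uses:

```python
import asyncio
import json

class EchoHandler:
    """Hypothetical handler: replies to a chat request with an echo."""
    subject = "ai.chat.request"

    async def handle(self, payload: dict) -> dict:
        # A real handler would run the RAG pipeline here
        return {"request_id": payload["request_id"], "reply": payload["text"].upper()}

async def dispatch(handler: EchoHandler, raw: bytes) -> bytes:
    # NATS delivers raw bytes: decode, handle, re-encode the reply
    reply = await handler.handle(json.loads(raw))
    return json.dumps(reply).encode()
```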
### Add a New Argo Workflow
1. Add WorkflowTemplate to `argo/` repo
2. Push to main → Gitea syncs to cluster
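A minimal WorkflowTemplate for the `argo/` repo could look like this sketch; the template name, namespace, and image are illustrative:

```yaml
# Hypothetical template -- not one of the actual argo/ repo workflows
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: example-batch-job
  namespace: argo
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: python:3.11-slim
        command: [python, -c, "print('hello from Argo')"]
```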
### Add a New Kubeflow Pipeline
1. Add pipeline .py to `kubeflow/` repo
2. Compile with `python pipeline.py`
3. Upload YAML to Kubeflow UI
### Create Architecture Decision


@@ -25,24 +25,34 @@ kubernetes/
- Apps: lowercase with hyphens (`chat-handler`, `voice-assistant`)
- Secrets: `{app}-{type}` (e.g., `milvus-credentials`)
### AI/ML Repos (git.daviestechlabs.io/daviestechlabs)
```
handler-base/                  # Shared library for all handlers
├── handler_base/
│   ├── handler.py             # Base Handler class
│   ├── nats_client.py         # NATS wrapper
│   ├── config.py              # Pydantic Settings
│   ├── health.py              # K8s probes
│   ├── telemetry.py           # OpenTelemetry
│   └── clients/               # Service clients
└── pyproject.toml
chat-handler/                  # Text chat service
voice-assistant/               # Voice pipeline service
├── {name}.py                  # Standalone version
├── {name}_v2.py               # Handler-base version (preferred)
└── Dockerfile.v2
argo/                          # Argo WorkflowTemplates
├── {workflow-name}.yaml
kubeflow/                      # Kubeflow Pipelines
├── {pipeline}_pipeline.py
kuberay-images/                # GPU worker images
├── dockerfiles/
└── ray-serve/
```
---


@@ -237,20 +237,26 @@
---
## Python Dependencies (handler-base)
Core library for all NATS handlers: [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base)
```toml
# Core
nats-py>=2.7.0             # NATS client
msgpack>=1.0.0             # Binary serialization
httpx>=0.27.0              # HTTP client

# ML/AI
pymilvus>=2.4.0            # Milvus client
openai>=1.0.0              # vLLM OpenAI API

# Observability
opentelemetry-api>=1.20.0
opentelemetry-sdk>=1.20.0
mlflow>=2.10.0             # Experiment tracking

# Kubeflow (kubeflow repo)
kfp>=2.12.1                # Pipeline SDK
```


@@ -65,20 +65,23 @@ homelab-k8s2/
### Multi-Repository Sync
```yaml
# GitRepository for Gitea repos (daviestechlabs org)
# Examples: argo, kubeflow, chat-handler, voice-assistant
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: argo-workflows
  namespace: flux-system
spec:
  url: https://git.daviestechlabs.io/daviestechlabs/argo.git
  ref:
    branch: main
  # Public repos don't need secretRef
```
Note: The monolithic `llm-workflows` repo has been decomposed into separate repos
in the daviestechlabs Gitea organization. See AGENT-ONBOARDING.md for the full list.
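Each GitRepository is typically paired with a Flux Kustomization that applies its manifests to the cluster. This sketch assumes the repo is named `argo-workflows` in `flux-system`; the path and interval are illustrative:

```yaml
# Hypothetical pairing -- check homelab-k8s2 for the actual Kustomizations
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: argo-workflows
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: argo-workflows
  path: ./
  prune: true
```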
### SOPS Integration
```yaml