Update ADRs and docs to reflect the Go refactor
All checks were successful
Update README with ADR Index / update-readme (push) Successful in 1m2s

2026-02-23 06:14:23 -05:00
parent f19fa3e969
commit 100ba21eba
7 changed files with 181 additions and 129 deletions


@@ -22,13 +22,13 @@ You are working on a **homelab Kubernetes cluster** running:
| Repo | Purpose |
|------|---------|
-| `handler-base` | Shared Python library for NATS handlers |
-| `chat-handler` | Text chat with RAG pipeline |
-| `voice-assistant` | Voice pipeline (STT → RAG → LLM → TTS) |
+| `handler-base` | Shared Go module for NATS handlers (protobuf, health, OTel, clients) |
+| `chat-handler` | Text chat with RAG pipeline (Go) |
+| `voice-assistant` | Voice pipeline: STT → RAG → LLM → TTS (Go) |
| `kuberay-images` | GPU-specific Ray worker Docker images |
-| `pipeline-bridge` | Bridge between pipelines and services |
-| `stt-module` | Speech-to-text service |
-| `tts-module` | Text-to-speech service |
+| `pipeline-bridge` | Bridge between pipelines and services (Go) |
+| `stt-module` | Speech-to-text service (Go) |
+| `tts-module` | Text-to-speech service (Go) |
| `ray-serve` | Ray Serve inference services |
| `argo` | Argo Workflows (training, batch inference) |
| `kubeflow` | Kubeflow Pipeline definitions |
@@ -48,7 +48,7 @@ You are working on a **homelab Kubernetes cluster** running:
┌─────────────────────────────────────────────────────────────────┐
│ NATS MESSAGE BUS │
│ Subjects: ai.chat.*, ai.voice.*, ai.pipeline.* │
-│ Format: MessagePack (binary)
+│ Format: Protocol Buffers (binary, see ADR-0061)
└───────────────────────────┬─────────────────────────────────────┘
┌───────────────────┼───────────────────┐
@@ -93,19 +93,23 @@ talos/
### AI/ML Services (Gitea daviestechlabs org)
```
-handler-base/ # Shared handler library
-├── handler_base/ # Core classes
-│   ├── handler.py # Base Handler class
-│   ├── nats_client.py # NATS wrapper
-│   └── clients/ # Service clients (STT, TTS, LLM, etc.)
+handler-base/ # Shared Go module (NATS, health, OTel, protobuf)
+├── clients/ # HTTP clients (LLM, STT, TTS, embeddings, reranker)
+├── config/ # Env-based configuration (struct tags)
+├── gen/messagespb/ # Generated protobuf stubs
+├── handler/ # Typed NATS message handler
+├── health/ # HTTP health + readiness server
+└── natsutil/ # NATS publish/request with protobuf
-chat-handler/ # RAG chat service
-├── chat_handler_v2.py # Handler-base version
-└── Dockerfile.v2
+chat-handler/ # RAG chat service (Go)
+├── main.go
+├── main_test.go
+└── Dockerfile
-voice-assistant/ # Voice pipeline service
-├── voice_assistant_v2.py # Handler-base version
-└── pipelines/voice_pipeline.py
+voice-assistant/ # Voice pipeline service (Go)
+├── main.go
+├── main_test.go
+└── Dockerfile
argo/ # Argo WorkflowTemplates
├── batch-inference.yaml
@@ -127,8 +131,23 @@ kuberay-images/ # GPU worker images
## 🔌 Service Endpoints (Internal)
+```go
+// Copy-paste ready for Go handler services
+const (
+	NATSUrl       = "nats://nats.ai-ml.svc.cluster.local:4222"
+	VLLMUrl       = "http://llm-draft.ai-ml.svc.cluster.local:8000/v1"
+	WhisperUrl    = "http://whisper-predictor.ai-ml.svc.cluster.local"
+	TTSUrl        = "http://tts-predictor.ai-ml.svc.cluster.local"
+	EmbeddingsUrl = "http://embeddings-predictor.ai-ml.svc.cluster.local"
+	RerankerUrl   = "http://reranker-predictor.ai-ml.svc.cluster.local"
+	MilvusHost    = "milvus.ai-ml.svc.cluster.local"
+	MilvusPort    = 19530
+	ValkeyUrl     = "redis://valkey.ai-ml.svc.cluster.local:6379"
+)
+```
```python
-# Copy-paste ready for Python code
+# For Python services (Ray Serve, Kubeflow pipelines, Gradio UIs)
NATS_URL = "nats://nats.ai-ml.svc.cluster.local:4222"
VLLM_URL = "http://llm-draft.ai-ml.svc.cluster.local:8000/v1"
WHISPER_URL = "http://whisper-predictor.ai-ml.svc.cluster.local"
@@ -175,7 +194,7 @@ f"ai.pipeline.status.{request_id}" # Status updates
### Add a New NATS Handler
-1. Create handler repo or add to existing (use `handler-base` library)
+1. Create Go handler repo using `handler-base` module (see [ADR-0061](decisions/0061-go-handler-refactor.md))
2. Add K8s Deployment in `homelab-k8s2/kubernetes/apps/ai-ml/`
3. Push to main → Flux deploys automatically