Update ADRs and docs to reflect the Go refactor.
All checks were successful
Update README with ADR Index / update-readme (push) Successful in 1m2s
@@ -22,13 +22,13 @@ You are working on a **homelab Kubernetes cluster** running:
 
 | Repo | Purpose |
 |------|---------|
-| `handler-base` | Shared Python library for NATS handlers |
-| `chat-handler` | Text chat with RAG pipeline |
-| `voice-assistant` | Voice pipeline (STT → RAG → LLM → TTS) |
+| `handler-base` | Shared Go module for NATS handlers (protobuf, health, OTel, clients) |
+| `chat-handler` | Text chat with RAG pipeline (Go) |
+| `voice-assistant` | Voice pipeline: STT → RAG → LLM → TTS (Go) |
 | `kuberay-images` | GPU-specific Ray worker Docker images |
-| `pipeline-bridge` | Bridge between pipelines and services |
-| `stt-module` | Speech-to-text service |
-| `tts-module` | Text-to-speech service |
+| `pipeline-bridge` | Bridge between pipelines and services (Go) |
+| `stt-module` | Speech-to-text service (Go) |
+| `tts-module` | Text-to-speech service (Go) |
 | `ray-serve` | Ray Serve inference services |
 | `argo` | Argo Workflows (training, batch inference) |
 | `kubeflow` | Kubeflow Pipeline definitions |
@@ -48,7 +48,7 @@ You are working on a **homelab Kubernetes cluster** running:
 ┌─────────────────────────────────────────────────────────────────┐
 │                        NATS MESSAGE BUS                         │
 │         Subjects: ai.chat.*, ai.voice.*, ai.pipeline.*          │
-│                  Format: MessagePack (binary)                   │
+│         Format: Protocol Buffers (binary, see ADR-0061)         │
 └───────────────────────────┬─────────────────────────────────────┘
                             │
        ┌───────────────────┼───────────────────┐
@@ -93,19 +93,23 @@ talos/
 ### AI/ML Services (Gitea daviestechlabs org)
 
 ```
-handler-base/              # Shared handler library
-├── handler_base/          # Core classes
-│   ├── handler.py         # Base Handler class
-│   ├── nats_client.py     # NATS wrapper
-│   └── clients/           # Service clients (STT, TTS, LLM, etc.)
+handler-base/              # Shared Go module (NATS, health, OTel, protobuf)
+├── clients/               # HTTP clients (LLM, STT, TTS, embeddings, reranker)
+├── config/                # Env-based configuration (struct tags)
+├── gen/messagespb/        # Generated protobuf stubs
+├── handler/               # Typed NATS message handler
+├── health/                # HTTP health + readiness server
+└── natsutil/              # NATS publish/request with protobuf
 
-chat-handler/              # RAG chat service
-├── chat_handler_v2.py     # Handler-base version
-└── Dockerfile.v2
+chat-handler/              # RAG chat service (Go)
+├── main.go
+├── main_test.go
+└── Dockerfile
 
-voice-assistant/           # Voice pipeline service
-├── voice_assistant_v2.py  # Handler-base version
-└── pipelines/voice_pipeline.py
+voice-assistant/           # Voice pipeline service (Go)
+├── main.go
+├── main_test.go
+└── Dockerfile
 
 argo/                      # Argo WorkflowTemplates
 ├── batch-inference.yaml
@@ -127,8 +131,23 @@ kuberay-images/            # GPU worker images
 
 ## 🔌 Service Endpoints (Internal)
 
+```go
+// Copy-paste ready for Go handler services
+const (
+	NATSUrl       = "nats://nats.ai-ml.svc.cluster.local:4222"
+	VLLMUrl       = "http://llm-draft.ai-ml.svc.cluster.local:8000/v1"
+	WhisperUrl    = "http://whisper-predictor.ai-ml.svc.cluster.local"
+	TTSUrl        = "http://tts-predictor.ai-ml.svc.cluster.local"
+	EmbeddingsUrl = "http://embeddings-predictor.ai-ml.svc.cluster.local"
+	RerankerUrl   = "http://reranker-predictor.ai-ml.svc.cluster.local"
+	MilvusHost    = "milvus.ai-ml.svc.cluster.local"
+	MilvusPort    = 19530
+	ValkeyUrl     = "redis://valkey.ai-ml.svc.cluster.local:6379"
+)
+```
+
 ```python
-# Copy-paste ready for Python code
+# For Python services (Ray Serve, Kubeflow pipelines, Gradio UIs)
 NATS_URL = "nats://nats.ai-ml.svc.cluster.local:4222"
 VLLM_URL = "http://llm-draft.ai-ml.svc.cluster.local:8000/v1"
 WHISPER_URL = "http://whisper-predictor.ai-ml.svc.cluster.local"
@@ -175,7 +194,7 @@ f"ai.pipeline.status.{request_id}" # Status updates
 
 ### Add a New NATS Handler
 
-1. Create handler repo or add to existing (use `handler-base` library)
+1. Create Go handler repo using `handler-base` module (see [ADR-0061](decisions/0061-go-handler-refactor.md))
 2. Add K8s Deployment in `homelab-k8s2/kubernetes/apps/ai-ml/`
 3. Push to main → Flux deploys automatically