68 lines
2.0 KiB
Plaintext
```plaintext
%% Handler Deployment Strategy (ADR-0019)
%% C4 Component diagram showing platform layers with Ray cluster

flowchart TB
    subgraph platform["🏗️ Platform Layer"]
        direction LR
        kubeflow["📊 Kubeflow<br/>Pipelines"]
        kserve["🎯 KServe<br/>(visibility)"]
        mlflow["📈 MLflow<br/>(registry)"]
    end

    subgraph ray["⚡ Ray Cluster"]
        direction TB

        subgraph gpu_apps["🎮 GPU Inference (Workers)"]
            direction LR
            llm["/llm<br/>vLLM<br/>🟢 khelben 0.95 GPU"]
            whisper["/whisper<br/>Whisper<br/>🟡 elminster 0.5 GPU"]
            tts["/tts<br/>XTTS<br/>🟡 elminster 0.5 GPU"]
            embeddings["/embeddings<br/>BGE<br/>🔴 drizzt 0.8 GPU"]
            reranker["/reranker<br/>BGE<br/>🔵 danilo 0.8 GPU"]
        end

        subgraph cpu_apps["🖥️ CPU Handlers (Head Node)"]
            direction LR
            chat["/chat<br/>ChatHandler<br/>0 GPU"]
            voice["/voice<br/>VoiceHandler<br/>0 GPU"]
        end
    end

    subgraph support["🔧 Supporting Services"]
        direction LR
        nats["📨 NATS<br/>(events)"]
        milvus["🔍 Milvus<br/>(vectors)"]
        valkey["💾 Valkey<br/>(cache)"]
    end

    subgraph pypi["📦 Package Registry"]
        gitea_pypi["Gitea PyPI<br/>• handler-base<br/>• chat-handler<br/>• voice-assistant"]
    end

    %% Connections
    kubeflow --> ray
    kserve --> ray
    mlflow --> ray

    cpu_apps -->|"Ray internal calls"| gpu_apps
    cpu_apps --> nats
    cpu_apps --> milvus
    cpu_apps --> valkey

    gitea_pypi -->|"pip install<br/>runtime_env"| cpu_apps

    classDef platform fill:#9b59b6,color:white
    classDef gpu fill:#e74c3c,color:white
    classDef cpu fill:#3498db,color:white
    classDef support fill:#27ae60,color:white
    classDef registry fill:#f39c12,color:black

    class kubeflow,kserve,mlflow platform
    class llm,whisper,tts,embeddings,reranker gpu
    class chat,voice cpu
    class nats,milvus,valkey support
    class gitea_pypi registry
```
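
The GPU fractions in the diagram labels can be sanity-checked per node. The following is a minimal Python sketch, not part of the ADR itself: the `Deployment` record, the `head` placeholder name for the head node, and the assumption that each worker node exposes exactly one GPU are all hypothetical. The routes, node names, and fractions are copied from the diagram.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One routed app from the diagram (hypothetical record type)."""
    route: str
    node: str
    num_gpus: float


# Values copied from the diagram labels.
# "head" is a placeholder name for the Ray head node (assumption).
DEPLOYMENTS = [
    Deployment("/llm", "khelben", 0.95),
    Deployment("/whisper", "elminster", 0.5),
    Deployment("/tts", "elminster", 0.5),
    Deployment("/embeddings", "drizzt", 0.8),
    Deployment("/reranker", "danilo", 0.8),
    Deployment("/chat", "head", 0.0),
    Deployment("/voice", "head", 0.0),
]


def gpu_load_by_node(deployments):
    """Sum the requested GPU fraction for each node."""
    totals = defaultdict(float)
    for d in deployments:
        totals[d.node] += d.num_gpus
    return dict(totals)


def oversubscribed_nodes(deployments, gpus_per_node=1.0):
    """Return nodes whose total request exceeds gpus_per_node.

    gpus_per_node=1.0 is an assumption; the diagram does not state
    how many GPUs each node has.
    """
    loads = gpu_load_by_node(deployments)
    # Small tolerance guards against floating-point rounding in the sums.
    return sorted(n for n, total in loads.items() if total > gpus_per_node + 1e-9)


if __name__ == "__main__":
    print(gpu_load_by_node(DEPLOYMENTS))
    print(oversubscribed_nodes(DEPLOYMENTS))  # prints []
```

Under the one-GPU-per-node assumption, `elminster` is fully allocated by `/whisper` and `/tts` together, so adding any further GPU app to that node would show up in `oversubscribed_nodes`.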