%% GPU Allocation Diagram %% Shows how AI workloads are distributed across GPU nodes flowchart TB subgraph khelben["🖥️ khelben (AMD Strix Halo 64GB)"] direction TB vllm["🧠 vLLM
LLM Inference
100% GPU"] end subgraph elminster["🖥️ elminster (NVIDIA RTX 2070 8GB)"] direction TB whisper["🎤 Whisper
STT
~50% GPU"] xtts["🔊 XTTS
TTS
~50% GPU"] end subgraph drizzt["🖥️ drizzt (AMD Radeon 680M 12GB)"] direction TB embeddings["📊 BGE Embeddings
Vector Encoding
~80% GPU"] end subgraph danilo["🖥️ danilo (Intel Arc)"] direction TB reranker["📋 BGE Reranker
Document Ranking
~80% GPU"] end subgraph workloads["Workload Routing"] chat["💬 Chat Request"] voice["🎤 Voice Request"] end chat --> embeddings chat --> reranker chat --> vllm voice --> whisper voice --> embeddings voice --> reranker voice --> vllm voice --> xtts classDef nvidia fill:#76B900,color:white classDef amd fill:#ED1C24,color:white classDef intel fill:#0071C5,color:white class whisper,xtts nvidia class vllm,embeddings amd class reranker intel