homelab-design/diagrams/data-flow-voice.mmd
Billy D. 832cda34bd feat: add comprehensive architecture documentation
- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories
and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00


%% Voice Request Data Flow
%% Sequence diagram showing voice assistant processing
sequenceDiagram
    autonumber

    participant U as User
    participant W as Voice WebApp
    participant N as NATS
    participant VA as Voice Assistant
    participant STT as Whisper<br/>(STT)
    participant E as BGE Embeddings
    participant M as Milvus
    participant R as Reranker
    participant L as vLLM
    participant TTS as XTTS<br/>(TTS)

    U->>W: Record audio
    W->>N: Publish ai.voice.user.{id}.request<br/>(msgpack with audio bytes)
    N->>VA: Deliver voice request

    VA->>STT: Transcribe audio
    STT-->>VA: Transcription text

    alt RAG Enabled
        VA->>E: Generate query embedding
        E-->>VA: Query vector
        VA->>M: Search similar chunks
        M-->>VA: Top-K chunks
        opt Reranker Enabled
            VA->>R: Rerank chunks
            R-->>VA: Reordered chunks
        end
    end

    VA->>L: LLM inference
    L-->>VA: Response text

    VA->>TTS: Synthesize speech
    TTS-->>VA: Audio bytes

    VA->>N: Publish ai.voice.response.{id}<br/>(text + audio)
    N-->>W: Deliver response
    W-->>U: Play audio + show text

    Note over VA,TTS: Total latency target: < 3s
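
The Voice Assistant's part of the flow above (steps between "Deliver voice request" and "Publish ai.voice.response.{id}") can be sketched in Python as follows. This is only an illustration of the branching shown in the diagram, not the actual service code: `VoiceServices`, `handle_voice_request`, `make_stub_services`, and the `rag_enabled` / `reranker_enabled` flags are hypothetical names, the service calls are stand-in callables rather than real Whisper, Milvus, vLLM, or XTTS clients, and NATS transport and msgpack encoding are left out entirely.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List

LATENCY_TARGET_S = 3.0  # "Total latency target: < 3s" from the diagram note


@dataclass
class VoiceServices:
    """Hypothetical stand-ins for the backends named in the diagram."""
    transcribe: Callable[[bytes], str]             # Whisper (STT)
    embed: Callable[[str], List[float]]            # BGE Embeddings
    search: Callable[[List[float]], List[str]]     # Milvus
    rerank: Callable[[str, List[str]], List[str]]  # Reranker
    generate: Callable[[str, List[str]], str]      # vLLM
    synthesize: Callable[[str], bytes]             # XTTS (TTS)


def handle_voice_request(audio: bytes, svc: VoiceServices,
                         rag_enabled: bool = True,
                         reranker_enabled: bool = True) -> Dict[str, object]:
    """Run one request through the steps shown in the sequence diagram."""
    started = time.perf_counter()

    text = svc.transcribe(audio)

    chunks: List[str] = []
    if rag_enabled:                      # "alt RAG Enabled"
        vector = svc.embed(text)
        chunks = svc.search(vector)
        if reranker_enabled:             # "opt Reranker Enabled"
            chunks = svc.rerank(text, chunks)

    answer = svc.generate(text, chunks)
    speech = svc.synthesize(answer)

    elapsed = time.perf_counter() - started
    return {
        "text": answer,
        "audio": speech,
        "latency_s": elapsed,
        "within_target": elapsed < LATENCY_TARGET_S,
    }


def make_stub_services() -> VoiceServices:
    """Build deterministic fake services so the flow can be exercised offline."""
    return VoiceServices(
        transcribe=lambda audio: "what is talos",
        embed=lambda text: [float(len(text))],
        search=lambda vector: ["chunk-b", "chunk-a"],
        rerank=lambda query, chunks: sorted(chunks),
        generate=lambda query, chunks: f"{query} | context={','.join(chunks)}",
        synthesize=lambda text: text.encode("utf-8"),
    )
```

Calling `handle_voice_request(b"raw-audio", make_stub_services(), rag_enabled=False)` skips the embedding, search, and rerank calls, which mirrors the path where the RAG block in the diagram is not taken.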