Voice Assistant
End-to-end voice assistant pipeline for the DaviesTechLabs AI/ML platform.
Components
Real-time Handler (NATS-based)
The voice assistant service listens on NATS for audio requests and returns synthesized speech responses.
Pipeline: STT → Embeddings → Milvus RAG → Rerank → LLM → TTS
| File | Description |
|---|---|
| voice_assistant.py | Standalone handler (v1) |
| voice_assistant_v2.py | Handler using handler-base library |
| Dockerfile | Standalone image |
| Dockerfile.v2 | Handler-base image |
Kubeflow Pipeline (Batch)
For batch processing or async workflows via Kubeflow Pipelines.
| Pipeline | Description |
|---|---|
| voice_pipeline.yaml | Full STT → RAG → TTS pipeline |
| rag_pipeline.yaml | Text-only RAG pipeline |
| tts_pipeline.yaml | Simple text-to-speech |
# Compile pipelines
cd pipelines
pip install kfp==2.12.1
python voice_pipeline.py
Architecture
NATS (voice.request)
          │
          ▼
┌───────────────────┐
│  Voice Assistant  │
│      Handler      │
└───────────────────┘
          │
          ├──▶ Whisper STT (elminster)
          │         │
          │         ▼
          ├──▶ BGE Embeddings (drizzt)
          │         │
          │         ▼
          ├──▶ Milvus Vector Search
          │         │
          │         ▼
          ├──▶ BGE Reranker (danilo)
          │         │
          │         ▼
          ├──▶ vLLM (khelben)
          │         │
          │         ▼
          └──▶ XTTS TTS (elminster)
                    │
                    ▼
          NATS (voice.response.{id})
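The stage order in the diagram can be sketched in Python as a chain of callables. This is a minimal illustration, not the code in voice_assistant.py: the `Stages` dataclass, `run_pipeline`, and the `top_k` parameter are hypothetical names, and the stubs stand in for the real services (Whisper, BGE, Milvus, vLLM, XTTS) so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stages:
    """Hypothetical bundle of stage callables (assumption, not the real handler API)."""
    stt: Callable[[bytes], str]                    # audio -> transcription
    embed: Callable[[str], List[float]]            # text -> query vector
    search: Callable[[List[float]], List[str]]     # vector -> candidate passages
    rerank: Callable[[str, List[str]], List[str]]  # (query, passages) -> ordered passages
    llm: Callable[[str, List[str]], str]           # (query, context) -> answer text
    tts: Callable[[str], bytes]                    # answer text -> audio


def run_pipeline(audio: bytes, stages: Stages, top_k: int = 3) -> dict:
    """Run the six stages in the order shown in the diagram."""
    transcription = stages.stt(audio)
    vector = stages.embed(transcription)
    candidates = stages.search(vector)
    # Keep only the best-ranked passages as LLM context (top_k is an assumed knob).
    context = stages.rerank(transcription, candidates)[:top_k]
    answer = stages.llm(transcription, context)
    return {
        "transcription": transcription,
        "response": answer,
        "audio": stages.tts(answer),
    }


# Stub stages so the sketch runs without any services.
stub = Stages(
    stt=lambda audio: audio.decode("utf-8"),
    embed=lambda text: [float(len(text))],
    search=lambda vec: ["doc-a", "doc-b"],
    rerank=lambda query, docs: sorted(docs, reverse=True),
    llm=lambda query, ctx: f"{query} -> {ctx[0]}",
    tts=lambda text: text.encode("utf-8"),
)
result = run_pipeline(b"hello", stub)
```

Passing the stages in as callables keeps the ordering logic testable without network access; a real handler would presumably replace each stub with an HTTP or Milvus client call.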
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| NATS_URL | nats://nats.ai-ml.svc.cluster.local:4222 | NATS server |
| WHISPER_URL | http://whisper-predictor.ai-ml.svc.cluster.local | STT service |
| EMBEDDINGS_URL | http://embeddings-predictor.ai-ml.svc.cluster.local | Embeddings |
| RERANKER_URL | http://reranker-predictor.ai-ml.svc.cluster.local | Reranker |
| VLLM_URL | http://llm-draft.ai-ml.svc.cluster.local:8000 | LLM service |
| TTS_URL | http://tts-predictor.ai-ml.svc.cluster.local | TTS service |
| MILVUS_HOST | milvus.ai-ml.svc.cluster.local | Vector DB |
| COLLECTION_NAME | knowledge_base | Milvus collection |
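One way to read these variables with their documented defaults is sketched below. The `Settings` dataclass and `load_settings` function are hypothetical names for illustration; the handlers in this repository may load configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the table above."""
    nats_url: str
    whisper_url: str
    embeddings_url: str
    reranker_url: str
    vllm_url: str
    tts_url: str
    milvus_host: str
    collection_name: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read each variable, falling back to the default listed in the table."""
    env = os.environ if env is None else env
    return Settings(
        nats_url=env.get("NATS_URL", "nats://nats.ai-ml.svc.cluster.local:4222"),
        whisper_url=env.get("WHISPER_URL", "http://whisper-predictor.ai-ml.svc.cluster.local"),
        embeddings_url=env.get("EMBEDDINGS_URL", "http://embeddings-predictor.ai-ml.svc.cluster.local"),
        reranker_url=env.get("RERANKER_URL", "http://reranker-predictor.ai-ml.svc.cluster.local"),
        vllm_url=env.get("VLLM_URL", "http://llm-draft.ai-ml.svc.cluster.local:8000"),
        tts_url=env.get("TTS_URL", "http://tts-predictor.ai-ml.svc.cluster.local"),
        milvus_host=env.get("MILVUS_HOST", "milvus.ai-ml.svc.cluster.local"),
        collection_name=env.get("COLLECTION_NAME", "knowledge_base"),
    )


# Passing an explicit mapping makes overrides easy to try without touching os.environ.
defaults = load_settings({})
custom = load_settings({"COLLECTION_NAME": "docs"})
```

Calling `load_settings()` with no argument reads from the process environment, which is the usual case inside a container.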
NATS Message Format
Request (voice.request)
{
  "request_id": "uuid",
  "audio": "base64-encoded-audio",
  "language": "en",
  "collection": "knowledge_base"
}
Response (voice.response.{request_id})
{
  "request_id": "uuid",
  "transcription": "user question",
  "response": "assistant answer",
  "audio": "base64-encoded-audio"
}
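A client could build the request payload and decode the response roughly as follows. This sketch uses only the standard library and assumes the messages are UTF-8 JSON, as the examples above suggest; the helper names (`build_request`, `response_subject`, `parse_response`) are hypothetical, and the `language` and `collection` defaults are simply the example values, not confirmed server-side defaults. Publishing and subscribing are left to whichever NATS client is in use.

```python
import base64
import json
import uuid
from typing import Tuple


def build_request(audio: bytes, language: str = "en",
                  collection: str = "knowledge_base") -> bytes:
    """Encode raw audio into the request shape shown above (JSON encoding assumed)."""
    payload = {
        "request_id": str(uuid.uuid4()),
        "audio": base64.b64encode(audio).decode("ascii"),
        "language": language,
        "collection": collection,
    }
    return json.dumps(payload).encode("utf-8")


def response_subject(request_id: str) -> str:
    """Subject on which the reply for a given request is expected."""
    return f"voice.response.{request_id}"


def parse_response(raw: bytes) -> Tuple[str, str, bytes]:
    """Return (transcription, answer text, decoded audio bytes) from a response message."""
    msg = json.loads(raw)
    return msg["transcription"], msg["response"], base64.b64decode(msg["audio"])


# Round-trip demonstration with in-memory data only.
request = json.loads(build_request(b"\x00\x01"))
sample_response = json.dumps({
    "request_id": request["request_id"],
    "transcription": "user question",
    "response": "assistant answer",
    "audio": base64.b64encode(b"wav-bytes").decode("ascii"),
}).encode("utf-8")
parsed = parse_response(sample_response)
```

A caller would publish the bytes from `build_request` to `voice.request` and wait on `response_subject(request_id)` for the reply.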
Building
# Standalone image (v1)
docker build -f Dockerfile -t voice-assistant:latest .
# Handler-base image (v2 - recommended)
docker build -f Dockerfile.v2 -t voice-assistant:v2 .
Related
- homelab-design - Architecture docs
- kuberay-images - Ray worker images
- handler-base - Base handler library