# Voice Assistant

End-to-end voice assistant pipeline for the DaviesTechLabs AI/ML platform.

## Components

### Real-time Handler (NATS-based)

The voice assistant service listens on NATS for audio requests and returns synthesized speech responses.

**Pipeline:** STT → Embeddings → Milvus RAG → Rerank → LLM → TTS

| File | Description |
|------|-------------|
| `voice_assistant.py` | Standalone handler (v1) |
| `voice_assistant_v2.py` | Handler using handler-base library |
| `Dockerfile` | Standalone image |
| `Dockerfile.v2` | Handler-base image |

### Kubeflow Pipeline (Batch)

For batch processing or async workflows via Kubeflow Pipelines.

| Pipeline | Description |
|----------|-------------|
| `voice_pipeline.yaml` | Full STT → RAG → TTS pipeline |
| `rag_pipeline.yaml` | Text-only RAG pipeline |
| `tts_pipeline.yaml` | Simple text-to-speech |

```bash
# Compile pipelines
cd pipelines
pip install kfp==2.12.1
python voice_pipeline.py
```

## Architecture

```
NATS (voice.request)
        │
        ▼
┌───────────────────┐
│  Voice Assistant  │
│     Handler       │
└───────────────────┘
        │
        ├──▶ Whisper STT (elminster)
        │         │
        │         ▼
        ├──▶ BGE Embeddings (drizzt)
        │         │
        │         ▼
        ├──▶ Milvus Vector Search
        │         │
        │         ▼
        ├──▶ BGE Reranker (danilo)
        │         │
        │         ▼
        ├──▶ vLLM (khelben)
        │         │
        │         ▼
        └──▶ XTTS TTS (elminster)
                  │
                  ▼
        NATS (voice.response.{id})
```

## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
| `WHISPER_URL` | `http://whisper-predictor.ai-ml.svc.cluster.local` | STT service |
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings |
| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | Reranker |
| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | LLM service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS service |
| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Vector DB |
| `COLLECTION_NAME` | `knowledge_base` | Milvus collection |

## NATS Message Format

### Request (voice.request)

```json
{
  "request_id": "uuid",
  "audio": "base64-encoded-audio",
  "language": "en",
  "collection": "knowledge_base"
}
```

### Response (voice.response.{request_id})

```json
{
  "request_id": "uuid",
  "transcription": "user question",
  "response": "assistant answer",
  "audio": "base64-encoded-audio"
}
```

## Building

```bash
# Standalone image (v1)
docker build -f Dockerfile -t voice-assistant:latest .

# Handler-base image (v2 - recommended)
docker build -f Dockerfile.v2 -t voice-assistant:v2 .
```

## Related

- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs
- [kuberay-images](https://git.daviestechlabs.io/daviestechlabs/kuberay-images) - Ray worker images
- [handler-base](https://github.com/Billy-Davies-2/llm-workflows/tree/main/handler-base) - Base handler library
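
## Example: Building and Parsing Messages

The request and response shapes listed under NATS Message Format can be exercised from a client with a few standard-library helpers. The sketch below is illustrative only: `build_request`, `response_subject`, and `parse_response` are hypothetical names, not functions from `voice_assistant.py`, and the defaults for `language` and `collection` simply mirror the example payload rather than documented handler behaviour. Publishing over NATS is left out; any NATS client can send the encoded bytes to `voice.request` and subscribe to the matching response subject.

```python
import base64
import json
import uuid
from typing import Tuple

# Subject the handler listens on, per the Architecture section.
REQUEST_SUBJECT = "voice.request"


def build_request(
    audio: bytes,
    language: str = "en",                # assumption: mirrors the example payload
    collection: str = "knowledge_base",  # assumption: mirrors COLLECTION_NAME default
) -> Tuple[str, bytes]:
    """Return (request_id, JSON-encoded payload) for a voice request."""
    request_id = str(uuid.uuid4())
    payload = {
        "request_id": request_id,
        "audio": base64.b64encode(audio).decode("ascii"),
        "language": language,
        "collection": collection,
    }
    return request_id, json.dumps(payload).encode("utf-8")


def response_subject(request_id: str) -> str:
    """Subject on which the reply for a given request is published."""
    return f"voice.response.{request_id}"


def parse_response(data: bytes) -> dict:
    """Decode a response message and turn its base64 audio back into bytes."""
    message = json.loads(data)
    message["audio"] = base64.b64decode(message["audio"])
    return message
```

A client would typically subscribe to `response_subject(request_id)` before publishing the payload to `REQUEST_SUBJECT`, so the reply cannot arrive before the subscription exists.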