Billy D. f0b626a5e7 feat: Add voice assistant handler and Kubeflow pipeline
- voice_assistant.py: Standalone NATS handler with full RAG pipeline
- voice_assistant_v2.py: Handler-base implementation
- pipelines/voice_pipeline.py: KFP SDK pipeline definitions
- Dockerfiles for both standalone and handler-base versions

Pipeline: STT → Embeddings → Milvus → Rerank → LLM → TTS
2026-02-01 20:32:37 -05:00


# Voice Assistant
End-to-end voice assistant pipeline for the DaviesTechLabs AI/ML platform.
## Components
### Real-time Handler (NATS-based)
The voice assistant service subscribes to the `voice.request` NATS subject for audio requests and publishes synthesized speech responses to `voice.response.{request_id}`.
**Pipeline:** STT → Embeddings → Milvus RAG → Rerank → LLM → TTS
| File | Description |
|------|-------------|
| `voice_assistant.py` | Standalone handler (v1) |
| `voice_assistant_v2.py` | Handler using handler-base library |
| `Dockerfile` | Standalone image |
| `Dockerfile.v2` | Handler-base image |
### Kubeflow Pipeline (Batch)
Use these pipelines for batch processing or asynchronous workflows run through Kubeflow Pipelines.
| Pipeline | Description |
|----------|-------------|
| `voice_pipeline.yaml` | Full STT → RAG → TTS pipeline |
| `rag_pipeline.yaml` | Text-only RAG pipeline |
| `tts_pipeline.yaml` | Simple text-to-speech |
```bash
# Compile pipelines
cd pipelines
pip install kfp==2.12.1
python voice_pipeline.py
```
## Architecture
```
NATS (voice.request)
┌───────────────────┐
│  Voice Assistant  │
│      Handler      │
└───────────────────┘
        ├──▶ Whisper STT (elminster)
        │         │
        │         ▼
        ├──▶ BGE Embeddings (drizzt)
        │         │
        │         ▼
        ├──▶ Milvus Vector Search
        │         │
        │         ▼
        ├──▶ BGE Reranker (danilo)
        │         │
        │         ▼
        ├──▶ vLLM (khelben)
        │         │
        │         ▼
        └──▶ XTTS TTS (elminster)
NATS (voice.response.{id})
```
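The stage order in the diagram can be sketched as a chain of calls. This is a minimal illustration, not the code in `voice_assistant.py`: the `Stages` container, `run_pipeline`, the `top_k` parameter, and the stub lambdas are all hypothetical names introduced here so the example runs without any of the real services.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stages:
    """Hypothetical callables standing in for the six backing services."""
    transcribe: Callable[[bytes], str]             # Whisper STT
    embed: Callable[[str], List[float]]            # BGE embeddings
    search: Callable[[List[float]], List[str]]     # Milvus vector search
    rerank: Callable[[str, List[str]], List[str]]  # BGE reranker
    generate: Callable[[str, List[str]], str]      # vLLM
    synthesize: Callable[[str], bytes]             # XTTS


def run_pipeline(audio: bytes, stages: Stages, top_k: int = 3) -> dict:
    """Run the stages in the order shown in the diagram."""
    text = stages.transcribe(audio)
    vector = stages.embed(text)
    candidates = stages.search(vector)
    # Keep only the best-ranked documents as LLM context (top_k is an assumption).
    context = stages.rerank(text, candidates)[:top_k]
    answer = stages.generate(text, context)
    speech = stages.synthesize(answer)
    return {"transcription": text, "response": answer, "audio": speech}


# Stub stages so the sketch runs without any service.
stub = Stages(
    transcribe=lambda audio: audio.decode("utf-8"),
    embed=lambda text: [float(len(text))],
    search=lambda vector: ["doc-a", "doc-b", "doc-c", "doc-d"],
    rerank=lambda query, docs: sorted(docs, reverse=True),
    generate=lambda query, ctx: f"{query} -> {len(ctx)} docs",
    synthesize=lambda text: text.encode("utf-8"),
)

result = run_pipeline(b"hello", stub)
```

Passing the stages in as callables keeps the ordering logic testable on its own; a real handler would presumably replace each stub with an HTTP or Milvus client call.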
## Configuration
| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
| `WHISPER_URL` | `http://whisper-predictor.ai-ml.svc.cluster.local` | STT service |
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings |
| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | Reranker |
| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | LLM service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS service |
| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Vector DB |
| `COLLECTION_NAME` | `knowledge_base` | Milvus collection |
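
One way to read these variables with their documented defaults is sketched below. The `DEFAULTS` mapping copies the table above; the `load_settings` helper and its optional `env` argument are assumptions made for illustration and may not match how the handlers actually load configuration.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the configuration table.
DEFAULTS = {
    "NATS_URL": "nats://nats.ai-ml.svc.cluster.local:4222",
    "WHISPER_URL": "http://whisper-predictor.ai-ml.svc.cluster.local",
    "EMBEDDINGS_URL": "http://embeddings-predictor.ai-ml.svc.cluster.local",
    "RERANKER_URL": "http://reranker-predictor.ai-ml.svc.cluster.local",
    "VLLM_URL": "http://llm-draft.ai-ml.svc.cluster.local:8000",
    "TTS_URL": "http://tts-predictor.ai-ml.svc.cluster.local",
    "MILVUS_HOST": "milvus.ai-ml.svc.cluster.local",
    "COLLECTION_NAME": "knowledge_base",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return every setting, falling back to the documented default.

    `env` defaults to the process environment; passing a plain dict
    makes the function easy to exercise in tests (hypothetical helper).
    """
    source = os.environ if env is None else env
    return {name: source.get(name, default) for name, default in DEFAULTS.items()}


# Override a single variable and keep the rest at their defaults.
settings = load_settings({"COLLECTION_NAME": "runbooks"})
```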
## NATS Message Format
### Request (voice.request)
```json
{
  "request_id": "uuid",
  "audio": "base64-encoded-audio",
  "language": "en",
  "collection": "knowledge_base"
}
```
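
A client could assemble and decode this payload roughly as follows. The helper names `build_request` and `parse_request` are hypothetical, and treating `language` and `collection` as optional with the defaults `en` and `knowledge_base` is an assumption rather than documented handler behaviour.

```python
import base64
import json
import uuid


def build_request(audio: bytes, language: str = "en",
                  collection: str = "knowledge_base") -> bytes:
    """Serialize raw audio bytes into the JSON request shown above."""
    payload = {
        "request_id": str(uuid.uuid4()),
        "audio": base64.b64encode(audio).decode("ascii"),
        "language": language,
        "collection": collection,
    }
    return json.dumps(payload).encode("utf-8")


def parse_request(data: bytes) -> dict:
    """Decode a request and turn the base64 audio back into bytes."""
    msg = json.loads(data)
    missing = [key for key in ("request_id", "audio") if key not in msg]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    msg["audio"] = base64.b64decode(msg["audio"], validate=True)
    # Assumed defaults for the optional fields.
    msg.setdefault("language", "en")
    msg.setdefault("collection", "knowledge_base")
    return msg


roundtrip = parse_request(build_request(b"\x00\x01fake-wav"))
```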
### Response (voice.response.{request_id})
```json
{
  "request_id": "uuid",
  "transcription": "user question",
  "response": "assistant answer",
  "audio": "base64-encoded-audio"
}
```
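
The reply subject and payload can be derived from the request ID as in the sketch below. `response_subject` and `build_response` are hypothetical helpers; the check on `request_id` is an added precaution based on NATS subject rules (`.` separates tokens, `*` and `>` are wildcards, whitespace is not allowed), not something the handler is known to enforce.

```python
import base64
import json


def response_subject(request_id: str) -> str:
    """Build the per-request reply subject, e.g. voice.response.<id>."""
    if not request_id or any(ch in request_id for ch in " \t\r\n.*>"):
        raise ValueError(f"request_id is not a valid subject token: {request_id!r}")
    return f"voice.response.{request_id}"


def build_response(request_id: str, transcription: str,
                   response: str, audio: bytes) -> bytes:
    """Serialize the handler's answer into the JSON response shown above."""
    payload = {
        "request_id": request_id,
        "transcription": transcription,
        "response": response,
        "audio": base64.b64encode(audio).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")


subject = response_subject("abc-123")
body = json.loads(build_response("abc-123", "user question",
                                 "assistant answer", b"wav-bytes"))
```

A caller that subscribes to the subject before publishing the request avoids missing a fast reply.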
## Building
```bash
# Standalone image (v1)
docker build -f Dockerfile -t voice-assistant:latest .
# Handler-base image (v2 - recommended)
docker build -f Dockerfile.v2 -t voice-assistant:v2 .
```
## Related
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs
- [kuberay-images](https://git.daviestechlabs.io/daviestechlabs/kuberay-images) - Ray worker images
- [handler-base](https://github.com/Billy-Davies-2/llm-workflows/tree/main/handler-base) - Base handler library