# Voice Assistant

End-to-end voice assistant pipeline for the DaviesTechLabs AI/ML platform.

## Components

### Real-time Handler (NATS-based)

The voice assistant service listens on NATS for audio requests and returns synthesized speech responses. It uses the [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) library for standardized NATS handling, telemetry, and health checks.

**Pipeline:** STT → Embeddings → Milvus RAG → Rerank → LLM → TTS

### Kubeflow Pipeline (Batch)

Use these pipelines for batch processing or asynchronous workflows on Kubeflow Pipelines.

| Pipeline | Description |
|----------|-------------|
| `voice_pipeline.yaml` | Full STT → RAG → TTS pipeline |
| `rag_pipeline.yaml` | Text-only RAG pipeline |
| `tts_pipeline.yaml` | Simple text-to-speech |

```bash
# Compile pipelines
cd pipelines
pip install kfp==2.12.1
python voice_pipeline.py
```

## Architecture

```
NATS (voice.request)
          │
          ▼
┌───────────────────┐
│  Voice Assistant  │
│      Handler      │
└───────────────────┘
          │
          ├──▶ Whisper STT (elminster)
          │         │
          │         ▼
          ├──▶ BGE Embeddings (drizzt)
          │         │
          │         ▼
          ├──▶ Milvus Vector Search
          │         │
          │         ▼
          ├──▶ BGE Reranker (danilo)
          │         │
          │         ▼
          ├──▶ vLLM (khelben)
          │         │
          │         ▼
          └──▶ XTTS TTS (elminster)
                    │
                    ▼
          NATS (voice.response.{id})
```

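The stage order in the diagram can be sketched as a simple sequential chain. This is only an illustration of the control flow, not the handler's actual code: `VoiceContext`, `stub_stage`, `STAGES`, and `run_pipeline` are hypothetical names, and each stage here is a placeholder that records its name instead of calling Whisper, the embeddings service, Milvus, the reranker, vLLM, or XTTS.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VoiceContext:
    """State handed from stage to stage (field names are illustrative)."""
    audio_b64: str
    transcription: str = ""
    documents: List[str] = field(default_factory=list)
    answer: str = ""
    speech_b64: str = ""
    trace: List[str] = field(default_factory=list)


Stage = Callable[[VoiceContext], VoiceContext]


def stub_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(ctx: VoiceContext) -> VoiceContext:
        ctx.trace.append(name)
        return ctx
    return stage


# Order taken from the diagram; a real stage would call its backing service.
STAGES: List[Stage] = [
    stub_stage(name)
    for name in ("stt", "embeddings", "milvus_search", "rerank", "llm", "tts")
]


def run_pipeline(ctx: VoiceContext, stages: List[Stage] = STAGES) -> VoiceContext:
    """Run every stage in order, feeding each one the previous result."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```

Keeping each stage behind the same callable signature makes it straightforward to swap a stub for a real client or to drop the speech stages for a text-only run.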
## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
| `WHISPER_URL` | `http://whisper-predictor.ai-ml.svc.cluster.local` | STT service |
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings |
| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | Reranker |
| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | LLM service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS service |
| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Vector DB |
| `COLLECTION_NAME` | `knowledge_base` | Milvus collection |

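A minimal sketch of how these variables might be resolved, falling back to the defaults in the table when a variable is unset. The `DEFAULTS` mapping and the `load_settings` helper are hypothetical and only mirror the table; the service itself may load its configuration differently (for example through handler-base).

```python
import os
from typing import Dict, Mapping, Optional

# Defaults copied from the configuration table above.
DEFAULTS: Dict[str, str] = {
    "NATS_URL": "nats://nats.ai-ml.svc.cluster.local:4222",
    "WHISPER_URL": "http://whisper-predictor.ai-ml.svc.cluster.local",
    "EMBEDDINGS_URL": "http://embeddings-predictor.ai-ml.svc.cluster.local",
    "RERANKER_URL": "http://reranker-predictor.ai-ml.svc.cluster.local",
    "VLLM_URL": "http://llm-draft.ai-ml.svc.cluster.local:8000",
    "TTS_URL": "http://tts-predictor.ai-ml.svc.cluster.local",
    "MILVUS_HOST": "milvus.ai-ml.svc.cluster.local",
    "COLLECTION_NAME": "knowledge_base",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return each setting from `env` (os.environ by default) or its default."""
    source = os.environ if env is None else env
    return {name: source.get(name, default) for name, default in DEFAULTS.items()}
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to exercise in tests, for example `load_settings({"COLLECTION_NAME": "docs"})`.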
## NATS Message Format

### Request (voice.request)

```json
{
  "request_id": "uuid",
  "audio": "base64-encoded-audio",
  "language": "en",
  "collection": "knowledge_base"
}
```

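As an illustration, a client could assemble the request payload like this. The `build_request` helper is hypothetical, and the sketch assumes standard base64 encoding, a UUID4 request ID, and a UTF-8 JSON body; publishing the bytes to `voice.request` is left to whichever NATS client is in use.

```python
import base64
import json
import uuid
from typing import Optional


def build_request(
    audio_bytes: bytes,
    language: str = "en",
    collection: str = "knowledge_base",
    request_id: Optional[str] = None,
) -> bytes:
    """Serialize a voice request in the JSON shape shown above."""
    payload = {
        # Generate an ID when the caller does not supply one.
        "request_id": request_id or str(uuid.uuid4()),
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "language": language,
        "collection": collection,
    }
    return json.dumps(payload).encode("utf-8")
```

The caller should keep the `request_id` it sent, since the reply arrives on a subject that includes it.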
### Response (voice.response.{request_id})

```json
{
  "request_id": "uuid",
  "transcription": "user question",
  "response": "assistant answer",
  "audio": "base64-encoded-audio"
}
```

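On the receiving side, a client needs the per-request subject and a way to unpack the reply. The sketch below uses hypothetical helpers (`response_subject`, `parse_response`) and assumes the reply is UTF-8 JSON with standard base64 audio, matching the shape above; it does not cover error replies, which are not documented here.

```python
import base64
import json
from typing import Tuple


def response_subject(request_id: str) -> str:
    """Subject on which the reply for this request is published."""
    return f"voice.response.{request_id}"


def parse_response(data: bytes) -> Tuple[str, str, bytes]:
    """Return (transcription, response text, decoded audio bytes)."""
    msg = json.loads(data)
    audio = base64.b64decode(msg["audio"])
    return msg["transcription"], msg["response"], audio
```

A client would typically subscribe to `response_subject(request_id)` before publishing the request so that the reply cannot be missed.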
## Building

```bash
docker build -t voice-assistant:latest .

# With specific handler-base tag
docker build --build-arg BASE_TAG=latest -t voice-assistant:latest .
```

## Related

- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs
- [kuberay-images](https://git.daviestechlabs.io/daviestechlabs/kuberay-images) - Ray worker images
- [handler-base](https://github.com/Billy-Davies-2/llm-workflows/tree/main/handler-base) - Base handler library