Billy D. f41198d8f2 feat: add e2e tests + benchmarks, fix config API
- e2e_test.go: full voice pipeline (STT->Embed->Rerank->LLM->TTS)
- main.go: fix config field->method references
- Benchmarks: full pipeline 481µs/op
2026-02-20 06:45:21 -05:00

Voice Assistant

End-to-end voice assistant pipeline for the DaviesTechLabs AI/ML platform.

Components

Real-time Handler (NATS-based)

The voice assistant service listens on NATS for audio requests and returns synthesized speech responses. It uses the handler-base library for standardized NATS handling, telemetry, and health checks.

Pipeline: STT → Embeddings → Milvus RAG → Rerank → LLM → TTS

Kubeflow Pipeline (Batch)

Kubeflow Pipelines definitions are provided for batch processing and asynchronous workflows.

| Pipeline | Description |
|----------|-------------|
| voice_pipeline.yaml | Full STT → RAG → TTS pipeline |
| rag_pipeline.yaml | Text-only RAG pipeline |
| tts_pipeline.yaml | Simple text-to-speech |

# Compile pipelines
cd pipelines
pip install kfp==2.12.1
python voice_pipeline.py

Architecture

NATS (voice.request)
        │
        ▼
┌───────────────────┐
│  Voice Assistant  │
│    Handler        │
└───────────────────┘
        │
        ├──▶ Whisper STT (elminster)
        │         │
        │         ▼
        ├──▶ BGE Embeddings (drizzt)
        │         │
        │         ▼
        ├──▶ Milvus Vector Search
        │         │
        │         ▼
        ├──▶ BGE Reranker (danilo)
        │         │
        │         ▼
        ├──▶ vLLM (khelben)
        │         │
        │         ▼
        └──▶ XTTS TTS (elminster)
                  │
                  ▼
         NATS (voice.response.{id})
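
The flow in the diagram can be read as a chain of six calls, each consuming the output of the previous one. The sketch below is only an illustration of that ordering and is independent of the service's own implementation: the `Stages` container, `run_pipeline`, and every callable are hypothetical stand-ins for the clients that talk to Whisper, BGE, Milvus, the reranker, vLLM, and XTTS.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stages:
    """Hypothetical stand-ins for the six backend calls in the diagram."""
    transcribe: Callable[[bytes], str]             # Whisper STT
    embed: Callable[[str], List[float]]            # BGE embeddings
    search: Callable[[List[float]], List[str]]     # Milvus vector search
    rerank: Callable[[str, List[str]], List[str]]  # BGE reranker
    generate: Callable[[str, List[str]], str]      # vLLM
    synthesize: Callable[[str], bytes]             # XTTS TTS


def run_pipeline(audio: bytes, stages: Stages) -> dict:
    """Run the stages in diagram order and collect the response fields."""
    text = stages.transcribe(audio)
    vector = stages.embed(text)
    candidates = stages.search(vector)
    context = stages.rerank(text, candidates)
    answer = stages.generate(text, context)
    speech = stages.synthesize(answer)
    return {"transcription": text, "response": answer, "audio": speech}
```

Passing the stages in as callables keeps the ordering testable without any of the backing services running.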

Configuration

| Environment Variable | Default | Description |
|----------------------|---------|-------------|
| NATS_URL | nats://nats.ai-ml.svc.cluster.local:4222 | NATS server |
| WHISPER_URL | http://whisper-predictor.ai-ml.svc.cluster.local | STT service |
| EMBEDDINGS_URL | http://embeddings-predictor.ai-ml.svc.cluster.local | Embeddings |
| RERANKER_URL | http://reranker-predictor.ai-ml.svc.cluster.local | Reranker |
| VLLM_URL | http://llm-draft.ai-ml.svc.cluster.local:8000 | LLM service |
| TTS_URL | http://tts-predictor.ai-ml.svc.cluster.local | TTS service |
| MILVUS_HOST | milvus.ai-ml.svc.cluster.local | Vector DB |
| COLLECTION_NAME | knowledge_base | Milvus collection |
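
For test scripts or local tooling that need the same settings, the variables can be resolved with a simple fall-back to the documented defaults. This is a sketch, not the service's own configuration code: `DEFAULTS` and `load_config` are hypothetical names, and treating an unset variable as "use the default" is an assumption about the intended behavior.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the configuration list above.
DEFAULTS = {
    "NATS_URL": "nats://nats.ai-ml.svc.cluster.local:4222",
    "WHISPER_URL": "http://whisper-predictor.ai-ml.svc.cluster.local",
    "EMBEDDINGS_URL": "http://embeddings-predictor.ai-ml.svc.cluster.local",
    "RERANKER_URL": "http://reranker-predictor.ai-ml.svc.cluster.local",
    "VLLM_URL": "http://llm-draft.ai-ml.svc.cluster.local:8000",
    "TTS_URL": "http://tts-predictor.ai-ml.svc.cluster.local",
    "MILVUS_HOST": "milvus.ai-ml.svc.cluster.local",
    "COLLECTION_NAME": "knowledge_base",
}


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return every setting, preferring the environment over the default."""
    source = os.environ if env is None else env
    return {name: source.get(name, default) for name, default in DEFAULTS.items()}
```

For example, `load_config({"NATS_URL": "nats://localhost:4222"})` overrides only the NATS address and leaves the other seven values at their defaults.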

NATS Message Format

Request (voice.request)

{
  "request_id": "uuid",
  "audio": "base64-encoded-audio",
  "language": "en",
  "collection": "knowledge_base"
}
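
A client could assemble this payload as follows. The sketch makes several assumptions that the format above does not state: that `request_id` is a version 4 UUID, that the audio uses the standard base64 alphabet, and that `language` and `collection` may fall back to the example values. `build_request` is a hypothetical helper; publishing the bytes to `voice.request` requires a NATS client and is not shown.

```python
import base64
import json
import uuid


def build_request(audio: bytes, language: str = "en",
                  collection: str = "knowledge_base") -> bytes:
    """Encode raw audio bytes into a JSON payload shaped like the example."""
    message = {
        "request_id": str(uuid.uuid4()),  # assumed: any unique ID string
        "audio": base64.b64encode(audio).decode("ascii"),
        "language": language,
        "collection": collection,
    }
    return json.dumps(message).encode("utf-8")
```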

Response (voice.response.{request_id})

{
  "request_id": "uuid",
  "transcription": "user question",
  "response": "assistant answer",
  "audio": "base64-encoded-audio"
}
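
On the receiving side, a client needs the per-request subject and the decoded audio. The helpers below are a hypothetical sketch under the assumption that the `audio` field is standard base64 and that all four fields are always present; error handling is left out.

```python
import base64
import json
from typing import Tuple


def response_subject(request_id: str) -> str:
    """Build the subject the response for this request is published on."""
    return f"voice.response.{request_id}"


def parse_response(payload: bytes) -> Tuple[str, str, bytes]:
    """Return (transcription, response text, decoded audio bytes)."""
    message = json.loads(payload)
    audio = base64.b64decode(message["audio"])
    return message["transcription"], message["response"], audio
```

A caller would subscribe to `response_subject(request_id)` before publishing the request, so the reply cannot arrive before the subscription exists.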

Building

docker build -t voice-assistant:latest .

# With specific handler-base tag
docker build --build-arg BASE_TAG=latest -t voice-assistant:latest .