# Chat Handler
Text-based chat pipeline for the DaviesTechLabs AI/ML platform.
## Overview
A NATS-based service that handles chat completion requests with RAG (Retrieval-Augmented Generation). It uses the [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) library for standardized NATS handling, telemetry, and health checks.

**Pipeline:** Query → Embeddings → Milvus → Rerank → LLM → (optional TTS)
## Architecture
```
NATS (ai.chat.request)
┌───────────────────┐
│   Chat Handler    │
└───────────────────┘
    ├──▶ BGE Embeddings (drizzt)
    │         │
    │         ▼
    ├──▶ Milvus Vector Search
    │         │
    │         ▼
    ├──▶ BGE Reranker (danilo)
    │         │
    │         ▼
    ├──▶ vLLM (khelben)
    │         │
    │         ▼ (optional)
    └──▶ XTTS TTS (elminster)
NATS (ai.chat.response.{id})
```
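
The flow in the diagram can be sketched as one function that calls each stage in order. This is a minimal illustration, not the handler's actual code: `Source`, `Stages`, and `run_pipeline` are hypothetical names, and each stage is an injected callable standing in for the real embeddings, Milvus, reranker, vLLM, and TTS clients.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Source:
    """One reranked passage, shaped like a `sources` entry in the response."""
    text: str
    score: float


@dataclass
class Stages:
    """Hypothetical bundle of stage callables (assumption, not handler-base API)."""
    embed: Callable[[str], list[float]]                      # query -> vector
    search: Callable[[list[float], str], list[str]]          # vector, collection -> passages
    rerank: Callable[[str, list[str]], list[Source]]         # query, passages -> scored passages
    generate: Callable[[str, list[Source], Optional[str]], str]  # query, context, system prompt -> answer
    synthesize: Optional[Callable[[str], str]] = None        # answer -> base64 audio


def run_pipeline(request: dict, stages: Stages,
                 default_collection: str = "knowledge_base") -> dict:
    """Run Query -> Embeddings -> Milvus -> Rerank -> LLM -> (optional TTS)."""
    query = request["query"]
    collection = request.get("collection", default_collection)

    vector = stages.embed(query)
    candidates = stages.search(vector, collection)
    sources = stages.rerank(query, candidates)
    answer = stages.generate(query, sources, request.get("system_prompt"))

    response = {
        "request_id": request["request_id"],
        "response": answer,
        "sources": [{"text": s.text, "score": s.score} for s in sources],
    }
    # Audio is attached only when the request asks for it and a TTS stage exists.
    if request.get("enable_tts") and stages.synthesize is not None:
        response["audio"] = stages.synthesize(answer)
    return response
```

Passing the stages in as callables keeps the ordering logic testable with stubs, without any of the backing services running.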
## NATS Message Format
### Request (ai.chat.request)
```json
{
  "request_id": "uuid",
  "query": "What is the capital of France?",
  "collection": "knowledge_base",
  "enable_tts": false,
  "system_prompt": "Optional custom system prompt"
}
```
### Response (ai.chat.response.{request_id})
```json
{
  "request_id": "uuid",
  "response": "The capital of France is Paris.",
  "sources": [
    {"text": "Paris is the capital...", "score": 0.95}
  ],
  "audio": "base64-encoded-audio (if TTS enabled)"
}
```
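
Because the response is published on a per-request subject, a client would typically subscribe to that subject before publishing the request. The sketch below assumes JSON-encoded payloads as shown above and uses the third-party `nats-py` client; `build_request`, `response_subject`, and `ask` are hypothetical helper names, not part of this service.

```python
import json
import uuid
from typing import Optional

REQUEST_SUBJECT = "ai.chat.request"


def build_request(query: str, collection: str = "knowledge_base",
                  enable_tts: bool = False,
                  system_prompt: Optional[str] = None) -> dict:
    """Build a request payload with a fresh UUID as request_id."""
    payload = {
        "request_id": str(uuid.uuid4()),
        "query": query,
        "collection": collection,
        "enable_tts": enable_tts,
    }
    # system_prompt is optional, so it is only sent when provided.
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload


def response_subject(request_id: str) -> str:
    """Subject on which the handler publishes the reply for this request."""
    return f"ai.chat.response.{request_id}"


async def ask(nats_url: str, query: str, timeout: float = 30.0) -> dict:
    """Send one chat request and wait for its response (assumes JSON payloads)."""
    import nats  # pip install nats-py; imported lazily so the helpers above need only the stdlib

    request = build_request(query)
    nc = await nats.connect(nats_url)
    try:
        # Subscribe first so the reply cannot arrive before we are listening.
        sub = await nc.subscribe(response_subject(request["request_id"]))
        await nc.publish(REQUEST_SUBJECT, json.dumps(request).encode())
        msg = await sub.next_msg(timeout=timeout)
        return json.loads(msg.data)
    finally:
        await nc.drain()
```

With a reachable server, `asyncio.run(ask("nats://nats.ai-ml.svc.cluster.local:4222", "What is the capital of France?"))` would return the decoded response dictionary.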
## Configuration
| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `NATS_URL` | `nats://nats.ai-ml.svc.cluster.local:4222` | NATS server |
| `EMBEDDINGS_URL` | `http://embeddings-predictor.ai-ml.svc.cluster.local` | Embeddings |
| `RERANKER_URL` | `http://reranker-predictor.ai-ml.svc.cluster.local` | Reranker |
| `VLLM_URL` | `http://llm-draft.ai-ml.svc.cluster.local:8000` | LLM service |
| `TTS_URL` | `http://tts-predictor.ai-ml.svc.cluster.local` | TTS (optional) |
| `MILVUS_HOST` | `milvus.ai-ml.svc.cluster.local` | Vector DB |
| `COLLECTION_NAME` | `knowledge_base` | Default Milvus collection |
| `ENABLE_TTS` | `false` | Enable audio responses |
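
As an illustration of how these variables might be read, the sketch below loads them into a typed settings object with the defaults from the table. `Settings` and `load_settings` are hypothetical names, and the set of strings treated as "true" for `ENABLE_TTS` is an assumption; the service's actual parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    nats_url: str
    embeddings_url: str
    reranker_url: str
    vllm_url: str
    tts_url: str
    milvus_host: str
    collection_name: str
    enable_tts: bool


# Defaults copied from the configuration table.
DEFAULTS = {
    "NATS_URL": "nats://nats.ai-ml.svc.cluster.local:4222",
    "EMBEDDINGS_URL": "http://embeddings-predictor.ai-ml.svc.cluster.local",
    "RERANKER_URL": "http://reranker-predictor.ai-ml.svc.cluster.local",
    "VLLM_URL": "http://llm-draft.ai-ml.svc.cluster.local:8000",
    "TTS_URL": "http://tts-predictor.ai-ml.svc.cluster.local",
    "MILVUS_HOST": "milvus.ai-ml.svc.cluster.local",
    "COLLECTION_NAME": "knowledge_base",
    "ENABLE_TTS": "false",
}

# Assumption: which spellings count as "enabled" is not specified in this README.
_TRUTHY = {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from `env`, falling back to the documented defaults."""
    def get(name: str) -> str:
        return env.get(name, DEFAULTS[name])

    return Settings(
        nats_url=get("NATS_URL"),
        embeddings_url=get("EMBEDDINGS_URL"),
        reranker_url=get("RERANKER_URL"),
        vllm_url=get("VLLM_URL"),
        tts_url=get("TTS_URL"),
        milvus_host=get("MILVUS_HOST"),
        collection_name=get("COLLECTION_NAME"),
        enable_tts=get("ENABLE_TTS").strip().lower() in _TRUTHY,
    )
```

Taking the environment as a parameter makes it easy to override a single value in tests, for example `load_settings({"ENABLE_TTS": "true"})`.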
## Building
```bash
docker build -t chat-handler:latest .
# With a specific handler-base tag (passed via the BASE_TAG build argument)
docker build --build-arg BASE_TAG=latest -t chat-handler:latest .
```
## Related
- [handler-base](https://git.daviestechlabs.io/daviestechlabs/handler-base) - Base handler library
- [voice-assistant](https://git.daviestechlabs.io/daviestechlabs/voice-assistant) - Voice pipeline
- [homelab-design](https://git.daviestechlabs.io/daviestechlabs/homelab-design) - Architecture docs